Computer implemented method of processing a visual query, search engine system for processing a visual query, and non-transitory computer-readable storage media
Patent Abstract:
A computer-implemented method of processing a visual query, a search engine system for processing a visual query, and non-transitory computer-readable storage media for a visual query. A visual query, such as a photograph, a screenshot, a scanned image, a video frame, or an image created by a content authoring application, is submitted to a visual query search system. The search system processes the visual query by sending it to a plurality of parallel search systems, each implementing a distinct visual query search process. These parallel search systems may include, but are not limited to, optical character recognition (OCR), facial recognition, product recognition, barcode recognition, object or object category recognition, named entity recognition, and color recognition. At least one search result is then sent to the client system. In some embodiments, when the visual query is an image containing a textual element and a non-textual element, at least one search result includes an optical character recognition result for the textual element and at least one image match result for the non-textual element.
Publication number: BR112012002815B1
Application number: R112012002815
Filing date: 2010-08-05
Publication date: 2020-06-09
Inventor: Petrou David
Applicant: Google Inc; Google Llc
IPC main class:
Patent Description:
"METHOD IMPLEMENTED BY COMPUTER FOR PROCESSING A VISUAL QUERY, SEARCH ENGINE SYSTEM FOR PROCESSING A VISUAL QUERY, AND NON-TRANSITORY COMPUTER-READABLE STORAGE MEDIUM"

FIELD OF THE INVENTION
The disclosed embodiments relate generally to a server system architecture encompassing a plurality of parallel search systems for processing a visual query.

BACKGROUND OF THE INVENTION
A text-based or term-based search, in which a user enters a word or phrase into a search engine and receives a variety of results, is a useful tool for searching. However, term-based queries require a user to be able to enter a relevant term. Sometimes a user may wish to know information about an image. For example, a user might want to know the name of a person in a photograph, or the name of a flower or bird in a picture. Accordingly, a system that can receive a visual query and provide search results would be desirable.

SUMMARY OF THE INVENTION
According to some embodiments, there is a computer-implemented method of processing a visual query on a server system. A visual query is received from a client system. The visual query is processed by sending the visual query to a plurality of parallel search systems for simultaneous processing. Each of the plurality of search systems implements a distinct visual query search process of a plurality of visual query search processes. The plurality of visual query search processes includes at least: optical character recognition (OCR), facial recognition, and a first query-by-image process that is distinct from OCR and facial recognition. A plurality of search results is received from one or more of the plurality of parallel search systems. At least one of the plurality of search results is sent to the client system.

In some embodiments, the method further includes, when at least two of the received search results meet predefined criteria, ranking the received search results that meet the predefined criteria and sending at least one search result from the ranked search results to the client system.

In some embodiments, the first query-by-image process is product recognition, barcode recognition, object or object category recognition, named entity recognition, or color recognition.

In some embodiments, the visual query is a photograph, a screenshot, a scanned image, or a video frame. The client system may be a mobile device, a desktop device, or another device. In some embodiments, the visual query is received from a client application executed by the client system, such as a search application, a search engine plug-in for a browser application, or a search engine extension for a browser application. In some embodiments, the visual query is received from a content authoring application executed by the client system.

When the visual query is an image that contains a textual element and a non-textual element, in some embodiments the search result includes an optical character recognition result for the textual element and at least one image match result for the non-textual element. In some embodiments, when the visual query is an image that contains a textual element and a non-textual element, the search result includes an interactive results document comprising a first visual identifier for the textual element with a link to a search result produced by an optical character recognition process, and a second visual identifier for the non-textual element with a link to a search result produced by an image match process.
In some embodiments, the method further includes combining at least two of the plurality of search results into a compound search result.

According to some embodiments, a search engine system for processing a visual query is provided. The system includes one or more central processing units for executing programs, and memory storing one or more programs to be executed by the one or more central processing units. The one or more programs include instructions for performing the following. A visual query is received from a client system. The visual query is processed by sending the visual query to a plurality of parallel search systems for simultaneous processing. Each of the plurality of search systems implements a distinct visual query search process of a plurality of visual query search processes. The plurality of visual query search processes includes at least: optical character recognition (OCR), facial recognition, and a first query-by-image process that is distinct from OCR and facial recognition. A plurality of search results is received from one or more of the plurality of parallel search systems. At least one of the plurality of search results is sent to the client system. Such a system may also include program instructions for executing the additional options discussed above.

According to some embodiments, a non-transitory computer-readable storage medium for processing a visual query is provided. The computer-readable storage medium stores one or more programs configured for execution by a computer, the one or more programs comprising instructions for performing the following. A visual query is received from a client system. The visual query is processed by sending the visual query to a plurality of parallel search systems for simultaneous processing. Each of the plurality of search systems implements a distinct visual query search process of a plurality of visual query search processes. The plurality of visual query search processes includes at least: optical character recognition (OCR), facial recognition, and a first query-by-image process that is distinct from OCR and facial recognition. A plurality of search results is received from one or more of the plurality of parallel search systems. At least one of the plurality of search results is sent to the client system. Such computer-readable storage media may also include program instructions for performing the additional options discussed above.

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a block diagram illustrating a computer network that includes a visual query server system.
Figure 2 is a flow diagram illustrating the process for responding to a visual query, according to some embodiments.
Figure 3 is a flow diagram illustrating the process for responding to a visual query with an interactive results document, according to some embodiments.
Figure 4 is a flow diagram illustrating the communications between a client and a visual query server system, according to some embodiments.
Figure 5 is a block diagram illustrating a client system, according to some embodiments.
Figure 6 is a block diagram illustrating a front end visual query processing server system, according to some embodiments.
Figure 7 is a block diagram illustrating a generic system of the parallel search systems used to process a visual query, according to some embodiments.
Figure 8 is a block diagram illustrating an OCR search system used to process a visual query, according to some embodiments.
Figure 9 is a block diagram illustrating a facial recognition search system used to process a visual query, according to some embodiments.
Figure 10 is a block diagram illustrating an image-to-terms search system used to process a visual query, according to some embodiments.
Figure 11 illustrates a client system with a screenshot of an exemplary visual query, according to some embodiments.
Figures 12A and 12B each illustrate a client system with a screenshot of an interactive results document with bounding boxes, according to some embodiments.
Figure 13 illustrates a client system with a screenshot of an interactive results document that is coded by type, according to some embodiments.
Figure 14 illustrates a client system with a screenshot of an interactive results document with labels, according to some embodiments.
Figure 15 illustrates a screenshot of an interactive results document and a visual query displayed concurrently with a results list, according to some embodiments.
Like reference numerals refer to corresponding parts throughout the drawings.

DESCRIPTION OF EMBODIMENTS
Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will be apparent to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.

It will also be understood that, although the terms 'first', 'second', etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are used only to distinguish one element from another. For example, a first contact could be termed a second contact and, similarly, a second contact could be termed a first contact, without departing from the scope of the present invention. The first contact and the second contact are both contacts, but they are not the same contact.

The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in the description of the invention and the appended claims, the singular forms 'a', 'an' and 'the' are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will also be understood that the term 'and/or', as used herein, refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms 'comprises' and/or 'comprising', when used in this specification, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.

As used herein, the term 'if' may be construed to mean 'when' or 'upon' or 'in response to determining' or 'in response to detecting', depending on the context. Similarly, the phrase 'if it is determined' or 'if (a stated condition or event) is detected' may be construed to mean 'upon determining' or 'in response to determining' or 'upon detecting (the stated condition or event)' or 'in response to detecting (the stated condition or event)', depending on the context.
Figure 1 is a block diagram illustrating a computer network that includes a visual query server system, according to some embodiments. The computer network 100 includes one or more client systems 102 and a visual query server system 106. One or more communications networks 104 interconnect these components. The communications network 104 may be any of a variety of networks, including local area networks (LAN), wide area networks (WAN), wireless networks, wired networks, the Internet, or a combination of such networks.

The client system 102 includes a client application 108, which is executed by the client system, for receiving a visual query (for example, visual query 1102 of figure 11). A visual query is an image that is submitted as a query to a search engine or search system. Examples of visual queries include, without limitation, photographs, scanned documents and images, and drawings. In some embodiments, the client application 108 is selected from the set consisting of a search application, a search engine plug-in for a browser application, and a search engine extension for a browser application. In some embodiments, the client application 108 is an 'omnivorous' search box, which allows a user to drag and drop any image format into the search box to be used as the visual query.

A client system 102 sends queries to and receives data from the visual query server system 106. The client system 102 may be any computer or other device capable of communicating with the visual query server system 106. Examples include, without limitation, desktop and laptop computers, mainframe computers, server computers, mobile devices such as mobile phones and personal digital assistants, network terminals, and integrated receivers/decoders (set-top boxes).

The visual query server system 106 includes a front end visual query processing server 110. The front end server 110 receives a visual query from the client 102 and sends the visual query to a plurality of parallel search systems 112 for simultaneous processing. Each of the search systems 112 implements a distinct visual query search process and accesses its corresponding databases 114, as needed, to process the visual query by its distinct search process. For example, a facial recognition search system 112-A will access a facial image database 114-A to look for facial matches to the image query. As will be explained in more detail with reference to figure 9, if the visual query contains a face, the facial recognition search system 112-A will return one or more search results (for example, names, matching faces, etc.) from the facial image database 114-A. In another example, the optical character recognition (OCR) search system 112-B converts any recognizable text in the visual query into text for return as one or more search results. In the optical character recognition (OCR) search system 112-B, an OCR database 114-B may be accessed to recognize particular fonts or text patterns, as explained in more detail with reference to figure 8. Any number of parallel search systems 112 may be used.
Some examples include a facial recognition search system 112-A, an OCR search system 112-B, an image-to-terms search system 112-C (which can recognize an object or an object category), a product recognition search system (which can be configured to recognize 2D images, such as book covers and CDs, and can also be configured to recognize 3D images, such as furniture), a barcode recognition search system (which recognizes 1D- and 2D-style barcodes), a named entity recognition search system, a landmark recognition search system (which can be configured to recognize particular famous landmarks, such as the Eiffel Tower, and can also be configured to recognize a corpus of specific images, such as billboards), a place recognition search system aided by geolocation information provided by a GPS receiver in the client system 102 or by the cellular network, a color recognition search system, and a similar-image search system (which searches for and identifies images similar to a visual query). Further search systems can be added as additional parallel search systems, represented in figure 1 by system 112-N. Herein, all of the search systems except the OCR search system are collectively defined as search systems performing an image-match process. All of the search systems, including the OCR search system, are collectively referred to as query-by-image search systems. In some embodiments, the visual query server system 106 includes a facial recognition search system 112-A, an OCR search system 112-B, and at least one other query-by-image search system 112.

Each of the parallel search systems 112 individually processes the visual search query and returns its results to the front end server system 110. In some embodiments, the front end server 110 may perform one or more analyses of the search results, such as one or more of: aggregating the results into a compound document, choosing a subset of the results to display, and ranking the results, as will be explained in more detail with reference to figure 6. The front end server 110 communicates the search results to the client system 102.

The client system 102 presents the one or more search results to the user. The results may be presented on a display, by an audio speaker, or by any other means used to communicate information to a user. The user may interact with the search results in a variety of ways. In some embodiments, the user's selections, annotations, and other interactions with the search results are transmitted to the visual query server system 106 and recorded together with the visual query in a query and annotation database 116. Information in the query and annotation database can be used to improve visual query results. In some embodiments, the information from the query and annotation database 116 is periodically pushed to the parallel search systems 112, which incorporate any relevant portions of the information into their respective individual databases 114.

The computer network 100 optionally includes a term query server system 118 for performing searches in response to term queries. A term query is a query containing one or more terms, as opposed to a visual query, which contains an image. The term query server system 118 may be used to generate search results that supplement the information produced by the various search systems in the visual query server system 106. The results returned from the term query server system 118 can be in any format, and may include textual documents, images, video, and the like. To make the architecture of figure 1 concrete, a minimal sketch of how such pluggable parallel search systems might be modeled follows.
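The sketch below, in Python, is an illustration only; the class, method, and registry names (SearchSystem, SearchResult, REGISTRY, and so on) are hypothetical and not taken from the patent, which does not specify an implementation. Each subclass stands in for one parallel search system 112 backed by its own database 114.

```python
# Hypothetical model of figure 1: each parallel search system 112 is a
# subclass with its own distinct visual query search process.
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SearchResult:
    system: str             # which parallel search system produced it
    score: Optional[float]  # None stands for a null search score
    payload: dict = field(default_factory=dict)


class SearchSystem(ABC):
    name = "generic"

    @abstractmethod
    def process(self, image: bytes) -> List[SearchResult]:
        """Run this system's distinct visual query search process."""


class OCRSearchSystem(SearchSystem):
    name = "ocr"

    def process(self, image: bytes) -> List[SearchResult]:
        text = self._recognize_text(image)
        if not text:  # no recognizable text: null result
            return [SearchResult(self.name, None)]
        return [SearchResult(self.name, 1.0, {"text": text})]

    def _recognize_text(self, image: bytes) -> str:
        return ""  # placeholder for a real OCR engine consulting database 114-B


class FacialRecognitionSearchSystem(SearchSystem):
    name = "facial"

    def process(self, image: bytes) -> List[SearchResult]:
        # A real system would search facial image database 114-A for
        # matching faces and return names, matching faces, etc.
        return [SearchResult(self.name, None)]


# Additional parallel search systems (112-N) are added by extending this list.
REGISTRY: List[SearchSystem] = [OCRSearchSystem(), FacialRecognitionSearchSystem()]
```

Registering a new search system then amounts to appending another SearchSystem subclass to the list, which mirrors the way additional parallel search systems 112-N can be added to the architecture.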
Although the term query server system 118 is shown as a separate system in figure 1, the visual query server system 106 may optionally include a term query server system 118.

Additional information on the operation of the visual query server system 106 is provided below with reference to the flowcharts of figures 2-4.

Figure 2 is a flow diagram illustrating a visual query server system method for responding to a visual query, according to certain embodiments of the invention. Each of the operations shown in figure 2 may correspond to instructions stored in a computer memory or non-transitory computer-readable storage medium.

The visual query server system receives a visual query from a client system (202). The client system may be, for example, a desktop computing device, a mobile device, or another similar device (204), as explained with reference to figure 1. An exemplary visual query on an exemplary client system is shown in figure 11.

The visual query is an image document of any suitable format. For example, the visual query may be a photograph, a screenshot, a scanned image, or a frame or sequence of multiple frames of a video (206). In some embodiments, the visual query is a drawing produced by a content authoring program (736, figure 5). As such, in some embodiments the user draws the visual query, while in other embodiments the user scans or photographs the visual query. Some visual queries are created using an image generation application such as Acrobat, a photo editing program, a drawing program, or an image editing program. For example, a visual query could come from a user taking a photograph of a friend on his or her mobile phone and then submitting the photograph as the visual query to the server system. The visual query could also come from a user scanning a page of a magazine, or taking a screenshot of a web page on a desktop computer, and then submitting the scan or screenshot as the visual query to the server system. In some embodiments, the visual query is submitted to the server system 106 through a search engine extension of a browser application, through a plug-in for a browser application, or by a search application executed by the client system 102. Visual queries may also be submitted by other application programs (executed by a client system) that support or generate images that can be transmitted by the client system to a remotely located server.

The visual query can be a combination of textual and non-textual elements (208). For example, a query could be a scan of a magazine page that contains images and text, such as a person standing next to a road sign. A visual query can include an image of a person's face, whether taken by a camera embedded in the client system or in a document scanned by or otherwise received by the client system. A visual query can also be a scan of a document containing only text. The visual query can also be an image of any number of distinct subjects, such as several birds in a forest, a person and an object (for example, a car, a park bench, etc.), or a person and an animal (for example, a pet, a farm animal, a butterfly, etc.). Visual queries may have two or more distinct elements. For example, a visual query could include a barcode and an image of a product, or a product name, on a product package. For example, the visual query could be a picture of a book cover that includes the book title, cover art, and a barcode.
In some cases, a visual query will produce two or more distinct search results corresponding to different parts of the visual query, as discussed in more detail below.

The server system processes the visual query as follows. The front end server system sends the visual query to a plurality of parallel search systems for simultaneous processing (210). Each search system implements a distinct visual query search process; that is, each individual search system processes the visual query according to its own processing scheme. In some embodiments, one of the search systems to which the visual query is sent for processing is an optical character recognition (OCR) search system. In some embodiments, one of the search systems to which the visual query is sent for processing is a facial recognition search system. In some embodiments, the plurality of search systems running distinct visual query search processes includes at least: optical character recognition (OCR), facial recognition, and another query-by-image process that is distinct from OCR and facial recognition (212). The other query-by-image process is selected from a set of processes that includes, but is not limited to, product recognition, barcode recognition, object or object category recognition, named entity recognition, and color recognition (212).

In some embodiments, named entity recognition occurs as a post-process of the OCR search system, in which the OCR text result is analyzed for famous people, places, objects and the like, and then the terms identified as named entities are searched in the term query server system (118, figure 1). In other embodiments, images of famous landmarks, logos, people, album covers, trademarks, and the like are recognized by an image-to-terms search system. In other embodiments, a distinct named entity query-by-image process, separate from the image-to-terms search system, is used. The object or object category recognition system recognizes generic result types, such as 'car'. In some embodiments, this system also recognizes product brands, particular product models, and the like, and provides more specific descriptions, such as 'Porsche'. Some of the search systems may be special user-specific search systems. For example, particular versions of color recognition and facial recognition could be special search systems used by the blind.

The front end server system receives the results from the parallel search systems (214). In some embodiments, the results are accompanied by a search score. For some visual queries, some of the search systems will find no relevant results. For example, if the visual query is a picture of a flower, the facial recognition search system and the barcode search system will find no relevant results. In some embodiments, if no relevant results are found, a null or zero search score is received from that search system (216). In some embodiments, if the front end server does not receive a result from a search system after a predefined period of time (for example, 0.2, 0.5, 1, 2, or 5 seconds), it will process the received results as though the timed-out server had produced a null search score, and will process the results received from the other search systems. Optionally, when at least two of the received search results meet predefined criteria, they are ranked (218). In some embodiments, one of the predefined criteria excludes void results; that is, one predefined criterion is that a result not be empty. The following sketch continues the earlier one and illustrates this dispatch-and-collect flow.
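This hypothetical fragment illustrates steps 210 through 218: the query is dispatched to every registered system for simultaneous processing, a system that times out is treated as having produced a null search score, and the results meeting the predefined criteria (non-empty, adequately scored) are ranked. MIN_SCORE and TIMEOUT_SECONDS are assumed parameters, not values from the patent.

```python
# Hypothetical dispatch-and-collect loop for the front end server
# (figure 2, steps 210-218), reusing SearchSystem/SearchResult/REGISTRY
# from the sketch above.
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import List

MIN_SCORE = 0.25       # assumed predefined minimum relevance score
TIMEOUT_SECONDS = 2.0  # e.g., 0.2, 0.5, 1, 2 or 5 seconds


def process_visual_query(image: bytes) -> List[SearchResult]:
    results: List[SearchResult] = []
    with ThreadPoolExecutor(max_workers=len(REGISTRY)) as pool:
        futures = {pool.submit(s.process, image): s for s in REGISTRY}
        for future, system in futures.items():
            try:
                results.extend(future.result(timeout=TIMEOUT_SECONDS))
            except FutureTimeout:
                # Treat the timed-out system as a null search score and
                # carry on with the other systems' results. (A production
                # server would also cancel the straggler rather than
                # waiting for it at pool shutdown.)
                results.append(SearchResult(system.name, None))

    # Predefined criteria: drop null results and low-scoring results,
    # then rank what remains (step 218).
    eligible = [r for r in results if r.score is not None and r.score >= MIN_SCORE]
    eligible.sort(key=lambda r: r.score, reverse=True)
    return eligible
```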
In some embodiments, one of the predefined criteria excludes results whose numerical score (for example, for a relevance factor) falls below a predefined minimum score. Optionally, the plurality of search results is filtered (220). In some embodiments, the results are filtered only if the total number of results exceeds a predefined threshold. In some embodiments, all of the results are ranked, but results falling below the predefined minimum score are excluded. For some visual queries, the content of the results is filtered. For example, if some of the results contain private information or personally protected information, those results are filtered out.

Optionally, the visual query server system creates a compound search result (222). One such embodiment is when more than one search system result is embedded in an interactive results document, as explained with reference to figure 3. The term query server system (118, figure 1) may augment the results from one of the parallel search systems with results from a term search, where the additional results are either links to documents or information sources, or text and/or images containing additional information that may be relevant to the visual query. Thus, for example, the compound search result may contain an OCR result and a link to a named entity in the OCR document (224).

In some embodiments, the OCR search system (112-B, figure 1) or the front end visual query processing server (110, figure 1) recognizes words that are likely to be relevant in the text. For example, either may recognize named entities, such as famous people or places. The named entities are submitted as query terms to the term query server system (118, figure 1). In some embodiments, the term query results produced by the term query server system are embedded in the visual query result as a link. In some embodiments, the term query results are returned as separate links. For example, if a picture of a book cover were the visual query, it is likely that an object recognition search system would produce a high-scoring match for the book. As such, a term query on the title of the book will be run on the term query server system 118, and the term query results are returned together with the visual query results. In some embodiments, the term query results are presented in a labeled group to distinguish them from the visual query results. The results may be searched individually, or a search may be performed using all of the named entities recognized in the search query to produce particularly relevant additional search results. For example, if the visual query is a scanned travel guide about Paris, the returned result may include links to the term query server system 118 for initiating a search on the term query 'Notre Dame'. Similarly, compound search results can include results from text searches for recognized famous images. For example, in the same travel guide, live links to term query results for famous destinations shown as pictures in the guide, such as the Eiffel Tower and the Louvre, may also be shown (even if the terms 'Eiffel Tower' and 'Louvre' do not appear in the guide itself).

The visual query server system then sends at least one result to the client system (226). Typically, if the visual query processing server receives a plurality of search results from at least some of the plurality of search systems, it will send at least one of the plurality of search results to the client system.
For some visual queries, only one search system will return relevant results. For example, in a visual query containing only an image of text, only the OCR server's results may be relevant. For some visual queries, only one result from one search system may be relevant. For example, only the product related to a scanned barcode may be relevant. In these cases, the front end visual processing server will return only the relevant search result or results. For some visual queries, a plurality of search results is sent to the client system, and the plurality of search results includes search results from more than one of the parallel search systems (228). This may occur when more than one distinct image is in the visual query. For example, if the visual query were a picture of a person riding a horse, results of facial recognition of the person could be displayed along with object identification results for the horse. In some embodiments, all of the results for a particular query-by-image search system are grouped and presented together. For example, the top N facial recognition results are displayed under a 'facial recognition results' heading, and the top N object recognition results are displayed together under an 'object recognition results' heading. Alternatively, as discussed below, the search results from a particular image search system may be grouped by image region. For example, if the visual query includes two faces, both of which produce facial recognition results, the results for each face are presented as a distinct group. For some visual queries (for example, a visual query including an image of both text and one or more objects), the search results may include both OCR results and one or more image match results (230).

In some embodiments, the user may wish to learn more about a particular search result. For example, if the visual query was a picture of a dolphin and the image-to-terms search system returned the terms 'water', 'dolphin', 'blue', and 'Flipper', the user may wish to run a text-based term query search on 'Flipper'. When the user wishes to run a search on a term query (for example, as indicated by the user clicking on or otherwise selecting a corresponding link in the search results), the term query server system (118, figure 1) is accessed, and the search on the selected term or terms is run. The corresponding term query results are displayed on the client system either separately or together with the visual query results (232). In some embodiments, the front end visual query processing server (110, figure 1) automatically chooses (that is, without receiving any user command other than the initial visual query) one or more top potential text results for the visual query, runs those text results on the term query server system 118, and then returns those term query results together with the visual query result to the client system as part of sending at least one search result to the client system (232). In the example above, if 'Flipper' was the first term result for the visual query picture of a dolphin, the front end server runs a term query on 'Flipper' and returns those term query results together with the visual query results to the client system. This embodiment, in which a term result considered likely to be selected by the user is automatically executed before the search results of the visual query are sent to the user, saves the user time.
In some embodiments, these results are displayed as a compound search result (222), as explained above. In other embodiments, the results are part of a search results list instead of, or in addition to, a compound search result.

Figure 3 is a flow diagram illustrating the process for responding to a visual query with an interactive results document. The first three operations (202, 210, 214) are described above with reference to figure 2. From the search results received from the parallel search systems (214), an interactive results document is created (302).

Creation of the interactive results document (302) will now be described in detail. For some visual queries, the interactive results document includes one or more visual identifiers of respective subparts of the visual query. Each visual identifier has at least one user-selectable link to at least one of the search results. A visual identifier identifies a respective subpart of the visual query. For some visual queries, the interactive results document has only one visual identifier with one user-selectable link to one or more results. In some embodiments, a respective user-selectable link to one or more of the search results has an activation region, and the activation region corresponds to the subpart of the visual query that is associated with the corresponding visual identifier.

In some embodiments, the visual identifier is a bounding box (304). In some embodiments, the bounding box encloses a subpart of the visual query, as shown in figure 12A. The bounding box need not be a square or rectangular box, but can be any sort of shape, including circular, oval, conformal (for example, to an object in, an entity in, or a region of the visual query), irregular, or any other shape, as shown in figure 12B. For some visual queries, the bounding box outlines the boundary of an identifiable entity in a subpart of the visual query (306). In some embodiments, each bounding box includes a user-selectable link to one or more search results, where the user-selectable link has an activation region corresponding to the subpart of the visual query surrounded by the bounding box. When the space inside the bounding box (the activation region of the user-selectable link) is selected by the user, search results corresponding to the image in the outlined subpart are returned.

In some embodiments, the visual identifier is a label (307), as shown in figure 14. In some embodiments, the label includes at least one term associated with the image in the respective subpart of the visual query. Each label is formatted for presentation in the interactive results document on or near the respective subpart. In some embodiments, the labels are color-coded.

In some embodiments, each respective visual identifier is formatted for presentation in a visually distinctive manner according to a type of entity recognized in the respective subpart of the visual query. For example, as shown in figure 13, the bounding boxes around a product, a person, a trademark, and the two textual areas are each presented with distinct hatching patterns, representing differently colored transparent bounding boxes. In some embodiments, the visual identifiers are formatted for presentation in visually distinctive manners such as overlay color, overlay pattern, label background color, label background pattern, label font color, and border color. One way such an interactive results document might be rendered is sketched below.
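The sketch below is an illustration only (the patent does not prescribe a rendering format): it lays out an interactive results document as HTML, with each bounding box drawn as an absolutely positioned user-selectable link whose activation region coincides with the subpart it outlines, carrying a label presented on or near that subpart. The field and helper names are hypothetical.

```python
# Hypothetical rendering of an interactive results document (figure 3,
# steps 302-307): bounding boxes overlaid on the visual query, each one
# a user-selectable link to the search results for its subpart.
from dataclasses import dataclass


@dataclass
class BoundingBox:
    left: int
    top: int
    width: int
    height: int
    label: str        # a term associated with the subpart (307)
    results_url: str  # link to the search results for this subpart


def build_interactive_results_document(image_url: str, boxes: list) -> str:
    overlays = []
    for box in boxes:
        overlays.append(
            f'<a href="{box.results_url}" title="{box.label}" '
            f'style="position:absolute; left:{box.left}px; top:{box.top}px; '
            f'width:{box.width}px; height:{box.height}px; '
            f'border:2px solid; display:block;">'
            f'<span class="label">{box.label}</span></a>'
        )
    # The activation region of each link is exactly the box it draws.
    return ('<div style="position:relative;">'
            f'<img src="{image_url}" alt="visual query">'
            + "".join(overlays) + "</div>")
```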
In some embodiments, the user-selectable link in the interactive results document is a link to a document or object that contains one or more results related to the corresponding subpart of the visual query (308). In some embodiments, at least one search result includes data related to the corresponding subpart of the visual query. As such, when the user selects the selectable link associated with the respective subpart, the user is directed to the search results corresponding to the entity recognized in the respective subpart of the visual query.

For example, if a visual query was a photograph of a barcode, there may be portions of the photograph, such as parts of the packaging to which the barcode is affixed, that are irrelevant. The interactive results document may include a bounding box around only the barcode. When the user selects inside the outlined barcode bounding box, the barcode search result is displayed. The barcode search result may include one result, the name of the product corresponding to that barcode, or the barcode results may include several results, such as a variety of places at which that product can be purchased, reviewed, and so on.

In some embodiments, when the subpart of the visual query corresponding to a respective visual identifier contains text comprising one or more terms, the search results corresponding to the respective visual identifier include results of a term query search on at least one of the terms in the text. In some embodiments, when the subpart of the visual query corresponding to the respective visual identifier contains a person's face for which at least one match (that is, a search result) is found that meets predefined reliability (or other) criteria, the search results corresponding to the respective visual identifier include one or more of: a name, an identifier, contact information, account information, address information, the current location of a related mobile device associated with the person whose face is contained in the selectable subpart, other images of the person whose face is contained in the selectable subpart, and potential image matches for the person's face. In some embodiments, when the subpart of the visual query corresponding to the respective visual identifier contains a product for which at least one match (that is, a search result) is found that meets predefined reliability (or other) criteria, the search results corresponding to the respective visual identifier include one or more of: product information, a product review, an option to initiate purchase of the product, an option to initiate a bid on the product, a list of similar products, and a list of related products.

Optionally, a respective user-selectable link in the interactive results document includes anchor text, which is displayed in the document without having to activate the link. The anchor text provides information, such as a key word or term, related to the information obtained when the link is activated. The anchor text may be displayed as part of the label (307), in a portion of a bounding box (304), or as additional information displayed when a user hovers a cursor over a user-selectable link for a predetermined period of time, such as 1 second.

Optionally, a respective user-selectable link in the interactive results document is a link to a search engine for searching for information or documents corresponding to a text-based query (sometimes referred to herein as a term query).
Activating the link causes the search engine to run the search, where the query and the search engine are specified by the link (for example, the search engine is specified by a URL in the link, and the text-based search query is specified by a URL parameter of the link), with the results returned to the client system. Optionally, the link in this example may include anchor text specifying the text or terms in the search query.

In some embodiments, the interactive results document produced in response to a visual query can include a plurality of links that correspond to results from the same search system. For example, a visual query may be an image or picture of a group of people. The interactive results document may include bounding boxes around each person which, when activated, return results from the facial recognition search system for each face in the group. For some visual queries, a plurality of links in the interactive results document correspond to search results from more than one search system (310). For example, if a picture of a person and a dog were submitted as the visual query, bounding boxes in the interactive results document could outline the person and the dog separately. When the person (in the interactive results document) is selected, search results from the facial recognition search system are returned, and when the dog (in the interactive results document) is selected, results from the image-to-terms search system are returned.

For some visual queries, the interactive results document contains an OCR result and an image match result (312). For example, if a picture of a person standing next to a sign were submitted as a visual query, the interactive results document may include visual identifiers for the person and for the text on the sign. Similarly, if a scan of a magazine were used as the visual query, the interactive results document may include visual identifiers for photographs or trademarks in advertisements on the page, as well as a visual identifier for the text of an article also on that page.

After the interactive results document has been created, it is sent to the client system (314). In some embodiments, the interactive results document (for example, document 1200, figure 15) is sent together with a list of search results from one or more parallel search systems, as discussed above with reference to figure 2. In some embodiments, the interactive results document is displayed on the client system above, or otherwise adjacent to, a list of search results from one or more parallel search systems (315), as shown in figure 15.

Optionally, the user will interact with the results document by selecting a visual identifier in the results document. The server system receives, from the client system, information regarding the user's selection of a visual identifier in the interactive results document (316). As discussed above, in some embodiments the link is activated by selecting an activation region inside a bounding box. In other embodiments, the link is activated by a user selection of a visual identifier of a subpart of the visual query that is not a bounding box. In some embodiments, the linked visual identifier is a hot button, a label located near the subpart, an underlined word in text, or another representation of an object or subject in the visual query.
In embodiments where the search results list is presented with the interactive results document (315), when the user selects a user-selectable link (316), the search result in the search results list corresponding to the selected link is identified. In some embodiments, the cursor will jump or automatically move to the first result corresponding to the selected link. In some embodiments in which the display of the client 102 is too small to display both the interactive results document and the entire search results list, selecting a link in the interactive results document causes the search results list to scroll or jump so as to display at least a first result corresponding to the selected link. In some other embodiments, in response to user selection of a link in the interactive results document, the results list is reordered such that the first result corresponding to the link is displayed at the top of the results list.

In some embodiments, when the user selects the user-selectable link (316), the visual query server system sends at least a subset of the results related to a corresponding subpart of the visual query to the client for display to the user (318). In some embodiments, the user can select multiple visual identifiers concurrently and will receive a subset of the results for all of the selected visual identifiers at the same time. In other embodiments, the search results corresponding to the user-selectable links are preloaded onto the client prior to user selection of any of the user-selectable links, so as to provide search results to the user virtually instantaneously in response to user selection of one or more links in the interactive results document.

Figure 4 is a flow diagram illustrating the communications between a client and a visual query server system. The client 102 receives a visual query from a user/querier (402). In some embodiments, visual queries can be accepted only from users who have signed up for or opted into the visual query system. In some embodiments, searches for facial recognition matches are performed only for users who have opted into the facial recognition visual query system, while other types of visual queries are performed for anyone, regardless of whether they have opted into the facial recognition part.

As explained, the format of the visual query can take many forms. The visual query will likely contain one or more subjects located in subparts of the visual query document. For some visual queries, the client system 102 performs type-recognition pre-processing on the visual query (404). In some embodiments, the client system 102 searches for particular recognizable patterns in this pre-processing system. For example, for some visual queries, the client may recognize colors. For some visual queries, the client may recognize that a particular subpart is likely to contain text (because that area is made up of small dark characters surrounded by light space, etc.). The client may contain any number of pre-processing type-recognition modules. In some embodiments, the client will have a type-recognition module (barcode recognition 406) for recognizing barcodes. It may do so by recognizing the distinctive striped pattern in a rectangular area. In some embodiments, the client will have a type-recognition module (face detection 408) for recognizing that a particular subject or subpart of the visual query is likely to contain a face. In some embodiments, the recognized type is returned to the user for verification. For example, the client system 102 may return a message stating 'a barcode was found in your visual query, are you interested in receiving barcode query results?'. A minimal sketch of the shape such client-side pre-processing results might take is given below.
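The sketch below uses hypothetical helper functions (find_striped_rectangles and find_face_like_regions are stand-in stubs, not real library calls); what it illustrates is the shape of the pre-processing results, one subject type value per hint, the subpart it refers to, and a confidence value, matching the 75%-sure example discussed afterwards.

```python
# Hypothetical client-side type-recognition pre-processing (figure 4,
# steps 404-408). Detection heuristics are stubs; what matters is the
# shape of the result: subject type, subpart, confidence value.
from dataclasses import dataclass


@dataclass
class TypeHint:
    subject_type: str  # "barcode", "face", "text", ...
    subpart: tuple     # (left, top, width, height) of the subpart
    confidence: float  # e.g., 0.75 means 75% sure


def find_striped_rectangles(image):
    return []  # stub: would look for a barcode's striped rectangular pattern


def find_face_like_regions(image):
    return []  # stub: would run a lightweight face detector


def preprocess_visual_query(image) -> list:
    hints = [TypeHint("barcode", region, 0.9)
             for region in find_striped_rectangles(image)]
    hints += [TypeHint("face", region, 0.75)
              for region in find_face_like_regions(image)]
    # Hints above a threshold accompany the visual query to the server;
    # the user can first be asked to verify the recognized type.
    return [h for h in hints if h.confidence >= 0.5]
```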
In some embodiments, the message may even indicate the subpart of the visual query where the type was found. In some embodiments, this presentation is similar to the interactive results document discussed with reference to figure 3. For example, it may outline a subpart of the visual query and indicate that the subpart is likely to contain a face, and ask the user whether he or she is interested in receiving facial recognition results.

After the client 102 performs the optional pre-processing of the visual query, the client sends the visual query to the visual query server system 106, specifically to the front end visual query processing server 110. In some embodiments, if the pre-processing produced relevant results, that is, if one of the type-recognition modules produced results above a certain threshold, indicating that the query or a subpart of the query is likely to be of a particular type (face, text, barcode, etc.), the client will pass along information regarding the pre-processing results. For example, the client may indicate that the face recognition module is 75% sure that a particular subpart of the visual query contains a face. More generally, the pre-processing results, if any, include one or more subject type values (for example, barcode, face, text, etc.). Optionally, the pre-processing results sent to the visual query server system include one or more of: for each subject type value in the pre-processing results, information identifying a subpart of the visual query corresponding to the subject type value; and, for each subject type value in the pre-processing results, a confidence value indicating a level of confidence in the subject type value and/or in the identification of a corresponding subpart of the visual query.

The front end server 110 receives the visual query from the client system (202). The received visual query may contain the pre-processing information discussed above. As explained, the front end server sends the visual query to a plurality of parallel search systems (210). If the front end server 110 received pre-processing information regarding the likelihood that a subpart contains a subject of a certain type, the front end server may pass this information along to one or more of the parallel search systems. For example, it may pass along the information that a particular subpart is likely to be a face, so that the facial recognition search system 112-A can process that subsection of the visual query first. Similarly, the same information (that a particular subpart is likely to be a face) may be used by the other parallel search systems to ignore that subpart or to analyze other subparts first. In some embodiments, the front end server will not pass the pre-processing information along to the parallel search systems, but will instead use this information to augment the way in which it processes the results received from the parallel search systems.

As explained with reference to figure 2, for some visual queries, the front end server 110 receives a plurality of search results from the parallel search systems (214). The front end server may then perform a variety of ranking and filtering, and may create an interactive search results document, as explained with reference to figures 2 and 3.
If the front end server 110 received pre-processing information regarding the likelihood that a subpart contains a subject of a certain type, it may filter and order by giving preference to those results that match the pre-processed recognized subject type. If the user indicated that a particular type of result was requested, the front end server will take the user's requests into account when processing the results. For example, the front end server may filter out all other results if the user requested only barcode information, or it may list all results pertaining to the requested type before listing the other results. If an interactive visual query document is returned, the server may pre-fetch the links associated with the type of result in which the user indicated interest, while providing only links for performing searches related to the other subjects indicated in the interactive results document. The front end server 110 then sends the search results to the client system (226).

The client 102 receives the results from the server system (412). Where applicable, these results will include results corresponding to the type of result found in the pre-processing stage. For example, in some embodiments, they will include one or more barcode results (414) or one or more facial recognition results (416). If the client's pre-processing modules indicated that a particular type of result was likely, and that result was found, the found results of that type will be listed prominently.

Optionally, the user will select or annotate one or more of the results (418). The user may select one search result, select a particular type of search result, and/or select a portion of an interactive results document (420). Selection of a result is implicit feedback that the returned result was relevant to the query. Such feedback information can be utilized in future query processing operations. An annotation provides explicit feedback on the returned result, which can likewise be utilized in future query processing operations. Annotations take the form of corrections of portions of the returned result (such as a correction of a word mis-recognized by OCR) or of a separate annotation (either free-form or structured).

The user's selection of one search result, generally selecting the 'correct' result from several results of the same type (for example, choosing the correct result from a facial recognition server), is a process referred to as selection among interpretations. The user's selection of a particular type of search result, generally selecting the result 'type' of interest from several different types of returned results (for example, choosing the OCR-recognized text of an article in a magazine rather than the visual results for the advertisements also on the same page), is a process referred to as disambiguation of intent. A user may similarly select particular linked words (such as recognized named entities) in an OCR-recognized document, as explained in detail with reference to figure 8.

The user may alternatively or additionally wish to annotate particular search results. This annotation may be done in free-form style or in a structured format (422). The annotations may be descriptions of the result, or they may be reviews of the result. For example, they may indicate the name of the subject or subjects in the result, or they might indicate 'this is a good book' or 'this product broke within a year of purchase'. Another example of an annotation is a user-drawn bounding box around a subpart of the visual query, together with user-provided text identifying the object or subject inside the bounding box. One possible shape for such a feedback record is sketched below.
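Illustratively, the selections and annotations described above might be captured in a record like the following before being sent to the server (424). The field names are hypothetical; the patent specifies the kinds of feedback, not a wire format.

```python
# Hypothetical feedback record covering implicit feedback (a selected
# result), disambiguation of intent (a selected result type), and
# explicit free-form or structured annotations (figure 4, 418-424).
from dataclasses import dataclass, asdict
from typing import Optional
import json


@dataclass
class UserFeedback:
    query_id: str
    selected_result_id: Optional[str] = None    # selection among interpretations
    selected_result_type: Optional[str] = None  # disambiguation of intent
    annotation_text: Optional[str] = None       # free-form, e.g. "this is a good book"
    annotation_form: Optional[dict] = None      # structured-format annotation
    annotation_box: Optional[tuple] = None      # user-drawn bounding box


def send_feedback(feedback: UserFeedback) -> bytes:
    # Serialized and sent to the front end server (424), which stores
    # annotations in the query and annotation database 116 (426).
    return json.dumps(asdict(feedback)).encode("utf-8")
```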
User annotations are explained in more detail with reference to figure 5. The user selections of search results and other annotations are sent to the server system (424). The front end server 110 receives the selections and annotations and processes them further (426). If the information was a selection of an object, sub-region, or term in an interactive results document, further information regarding that selection may be requested, as appropriate. For example, if the selection was of a visual result, more information about that visual result would be requested. If the selection was a word (whether from the OCR server or from the image-to-terms server), a textual search on that word would be sent to the term query server system 118. If the selection was of a person from a facial image recognition search system, that person's profile would be requested. If the selection was of a particular portion of an interactive search results document, the underlying visual query results would be requested. If the server system receives an annotation, the annotation is stored in a query and annotation database 116, explained with reference to figure 5. Information from the annotation database 116 is then periodically copied to individual annotation databases for one or more of the parallel server systems, as discussed below with reference to figures 7-10.

Figure 5 is a block diagram illustrating a client system 102 in accordance with one embodiment of the present invention. The client system 102 typically includes one or more processing units (CPUs) 702, one or more network or other communications interfaces 704, memory 712, and one or more communication buses 714 for interconnecting these components. The client system 102 includes a user interface 705. The user interface 705 includes a display device 706 and optionally includes an input device, such as a keyboard, mouse, or other input buttons 708. Alternatively, or in addition, the display device 706 includes a touch-sensitive surface 709, in which case the display 706/709 is a touch-sensitive display. On client systems that have a touch-sensitive display 706/709, a physical keyboard is optional (for example, a soft keyboard may be displayed when keyboard entry is needed). Furthermore, some client systems use a microphone and voice recognition to supplement or replace the keyboard. Optionally, the client 102 includes a GPS (global positioning satellite) receiver or other location detection device 707 for determining the location of the client system 102. In some embodiments, visual query search services are provided that require the client system 102 to provide the visual query server system with location information indicating the location of the client system 102.

The client system 102 also includes an image capture device 710, such as a camera or scanner. Memory 712 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, or other random access solid state memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices. Memory 712 may optionally include one or more storage devices remotely located from the CPU(s) 702.
Memory 712, or alternatively the non-volatile memory device(s) within memory 712, comprises a non-transitory computer-readable storage medium. In some embodiments, memory 712 or the computer-readable storage medium of memory 712 stores the following programs, modules, and data structures, or a subset thereof:
• an operating system 716 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 718 that is used to connect the client system 102 to other computers via the one or more communications network interfaces 704 (wired or wireless) and one or more communications networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;
• an image capture module 720 for processing a respective image captured by the image capture device/camera 710, where the respective image may be sent (for example, by a client application module) as a visual query to the visual query server system;
• one or more client application modules 722 for handling various aspects of querying by image, including but not limited to: an image-query submission module 724 for submitting visual queries to the visual query server system; optionally, a region-of-interest selection module 725 that detects a selection (such as a gesture on the touch-sensitive display 706/709) of a region of interest in an image and prepares that region of interest as a visual query; a results browser 726 for displaying the results of the visual query; and, optionally, an annotation module 728 with optional modules for structured annotation text entry 730, such as filling in a form, and for free-form annotation text entry 732, which can accept annotations in a variety of formats, and an image region selection module 734 (sometimes referred to herein as a result selection module) that allows a user to select a particular subpart of an image for annotation;
• an optional content authoring application (or applications) 736 that allows a user to author a visual query by creating or editing an image, rather than merely capturing one via the image capture device 710; optionally, one such application 736 may include instructions that enable a user to select a subpart of an image for use as a visual query;
• an optional local image analysis module 738 that pre-processes the visual query before sending it to the visual query server system; the local image analysis may recognize particular types of images, or sub-regions within an image, and examples of image types that may be recognized by such modules 738 include one or more of: facial type (a facial image recognized within the visual query), barcode type (a barcode recognized within the visual query), and text type (text recognized within the visual query); and
• additional optional client applications 740, such as an email application, a telephone application, a browser application, a mapping application, an instant messaging application, a social networking application, and so on. In some embodiments, the application corresponding to an appropriate actionable search result can be launched or accessed when the actionable search result is selected.

Optionally, the image region selection module 734, which allows a user to select a particular subpart of an image for annotation, also allows the user to choose a search result as a 'correct' match without necessarily annotating it further.
For example, the first N facial recognition matches can be presented to the user, and the user can choose the correct person from this list of results. For some search queries, more than one type of result will be presented, and the user will choose a type of result. For example, the image query may include a person standing next to a tree, but only the results regarding the person are of interest to the user. Therefore, the image selection module 734 allows the user to indicate which type of image is the correct type, that is, the type he is interested in receiving. The user may also wish to annotate the search result by adding personal comments or descriptive words, using either the structured annotation text entry module 730 (to fill out a form) or the freeform annotation text entry module 732. In some embodiments, the optional local image analysis module 738 is a part of the client application (108, figure 1). In addition, in some embodiments, the optional local image analysis module 738 includes one or more programs to perform local image analysis that pre-processes or categorizes the visual query or a part of it. For example, client application 722 may recognize that the image contains a barcode, a face or text before submitting the visual query to a search engine. In some embodiments, when the local image analysis module 738 detects that the visual query contains a particular type of image, the module asks the user whether he is interested in a corresponding type of search result. For example, the local image analysis module 738 can detect a face based on its general characteristics (that is, without determining whose face it is) and provide immediate feedback to the user before sending the query to the visual query server system. It may return a result such as, "A face was detected; are you interested in receiving facial recognition matches for this face?" This can save time for the visual query server system (106, figure 1). For some visual queries, the visual query processing server in the initial interface (110, figure 1) sends the visual query only to the search system 112 corresponding to the type of image recognized by the local image analysis module 738. In other embodiments, the visual query server system can send the visual query to all the search systems 112A-N, but will rank results from the search system 112 corresponding to the type of image recognized by the local image analysis module 738. In some embodiments, the way in which local image analysis impacts the operation of the visual query server system depends on the configuration of the client system, or on configuration or processing parameters associated with either the user or the client system. Furthermore, the actual content of any particular visual query and the results produced by the local image analysis can cause different visual queries to be treated differently, both in the client system and in the visual query server system. In some embodiments, barcode recognition is performed in two stages, with the analysis of whether the visual query includes a barcode performed on the client system, in the local image analysis module 738. The visual query is then passed to a barcode search system only if the client determines that the visual query is likely to include a barcode. In other embodiments, the barcode search system processes every visual query. Optionally, client system 102 includes additional client applications 740.
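The two-stage barcode flow just described lends itself to a short sketch. The detector below is deliberately crude (it just counts dark/light transitions along scan lines), and the function names, the threshold and the always-queried set of systems are assumptions for illustration; it shows only the shape of the decision, not a production detector.

```python
# Illustrative client-side pre-check in the spirit of local image analysis
# module 738: only queries that look like they contain a barcode are sent
# to the barcode search system. Thresholds and names are assumed.

def looks_like_barcode(gray_rows) -> bool:
    """gray_rows: list of rows of 0-255 grayscale pixel values. A barcode
    region tends to show many dark/light transitions along one row."""
    for row in gray_rows:
        transitions = sum(
            1 for a, b in zip(row, row[1:]) if (a < 128) != (b < 128)
        )
        if transitions > 30:   # arbitrary illustrative threshold
            return True
    return False

def systems_to_query(gray_rows):
    """Return which parallel search systems to consult for this query
    (the always-on set here is an assumption)."""
    systems = ["ocr", "facial_recognition", "image_to_terms"]
    if looks_like_barcode(gray_rows):
        systems.append("barcode")   # stage two runs on the server
    return systems
```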
Figure 6 is a block diagram illustrating the visual query processing server system in initial interface 110 according to an embodiment of the present invention. Typically, the initial interface server 110 includes one or more processing units (CPUs) 802, one or more network or other communication interfaces 804, memory 812 and one or more communication buses 814 to interconnect these components. Memory 812 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices or other non-volatile solid-state storage devices. Memory 812 may optionally include one or more storage devices remotely located in relation to the CPU(s) 802. Memory 812 or, alternatively, the non-volatile memory device(s) in memory 812 comprise non-temporary computer-readable storage media. In some embodiments, memory 812 or the computer-readable storage media of memory 812 stores the following programs, modules and data structures, or a subset of these:
• an operating system 816 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 818 that is used to connect the initial interface server system 110 to other computers via one or more network communication interfaces 804 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks and the like;
• a query manager 820 to handle incoming visual queries from client system 102 and send them to two or more parallel search systems; as described elsewhere in this document, in some special situations a visual query can be directed to just one of the search systems, such as when the visual query includes a client-generated instruction (for example, "facial recognition search only");
• a results filter module 822 to optionally filter the results from one or more parallel search systems and send the top or relevant results to client system 102 for presentation;
• a results ranking and formatting module 824 to optionally rank the results from one or more parallel search systems and to format the results for presentation;
• a results document creation module 826, used when appropriate to create an interactive search results document; module 826 may include submodules, including, but not limited to, a bounding box creation module 828 and a link creation module 830;
• a label creation module 831 to create labels that are visual identifiers for respective subparts of a visual query;
• an annotation module 832 to receive annotations from a user and send them to an annotation database 116;
• an actionable search results module 838 to generate, in response to a visual query, one or more actionable search result elements, each configured to initiate a client-side action; examples of actionable search result elements are buttons to initiate a phone call, to initiate an e-mail message, to map an address, to make a restaurant reservation and to provide an option to purchase a product; and
• a query and annotation database 116 comprising the database itself 834 and an index to the database 836.
The results ranking and formatting module 824 ranks the results retrieved from one or more of the parallel search systems (112-A - 112-N, figure 1).
As explained, for some visual queries only the results from one search system may be relevant. In such a case, only the relevant search results from that search system are ranked. For some visual queries, several types of search results may be relevant. In these cases, in some embodiments, the results ranking and formatting module 824 ranks all the results from the search system with the most relevant result (for example, the result with the highest relevance score) above the results from the less relevant search systems. In other embodiments, the results ranking and formatting module 824 ranks one top result from each relevant search system above the remaining results. In some embodiments, the results ranking and formatting module 824 ranks the results according to a relevance score computed for each of the search results. For some visual queries, augmented textual queries are performed in addition to the searching on the parallel visual search systems. In some embodiments, when textual queries are also performed, their results are presented in a way visually distinct from the visual search system results. The results ranking and formatting module 824 also formats the results. In some embodiments, the results are presented in a list format. In some embodiments, the results are presented through an interactive results document. In some embodiments, both an interactive results document and a list of results are presented. In some embodiments, the type of query dictates how the results are presented. For example, if more than one searchable subject is detected in the visual query, an interactive results document is produced, whereas if only one searchable subject is detected, the results are displayed only in a list format. The results document creation module 826 is used to create an interactive search results document. The interactive search results document can have one or more subjects detected and searched. The bounding box creation module 828 creates a bounding box around one or more of the searched subjects. The bounding boxes can be rectangular boxes, or they can outline the shape(s) of the subject(s). The link creation module 830 creates links to the search results associated with their respective subject in the interactive search results document. In some embodiments, clicking within the bounding box area activates the corresponding link inserted by the link creation module. The query and annotation database 116 contains information that can be used to improve visual query results. In some embodiments, the user can annotate the image after the visual query results have been presented. In addition, in some embodiments, the user can annotate the image before sending it to the visual query search system. Pre-annotation can aid visual query processing by focusing the results, or by running text-based searches on the annotated words in parallel with the visual query searches. In some embodiments, annotated versions of a picture can be made public (for example, when the user has given permission for publication, for example by designating the image and the annotation(s) as not private), so as to be returned as a potential image match. For example, if a user takes a picture of a flower and annotates the image with detailed genus and species information about that flower, the user may want the image to be presented to anyone who performs a visual query search looking for that flower.
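As a rough illustration of the annotation flow around database 116, the sketch below shows one plausible record shape and the filtering step that could precede the periodic copy to a parallel system's own annotation database, described next. The fields, tag vocabulary and function are assumptions, not the disclosed schema.

```python
# Hedged sketch of an annotation record and a per-system relevance filter;
# the schema and tags are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class Annotation:
    query_id: str
    region: tuple              # (x, y, width, height) of the annotated subpart
    text: str                  # structured (form) or freeform user text
    public: bool = False       # user opt-in before reuse as a potential match
    tags: set = field(default_factory=set)   # e.g. {"face"}, {"product"}

def annotations_for_system(system_tag: str, annotations):
    """Select the entries one parallel search system (e.g. 'product')
    should incorporate into its own annotation database."""
    return [a for a in annotations if system_tag in a.tags]
```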
In some embodiments, information from the query and annotation database 116 is periodically transferred to the parallel search systems 112, which incorporate the relevant parts of the information (if any) into their respective individual databases 114. Figure 7 is a block diagram illustrating one of the parallel search systems used to process a visual query. Figure 7 illustrates a generic server system 112-N according to an embodiment of the present invention. This server system is generic only in that it represents any one of the visual query search servers 112-N. Typically, the generic server system 112-N includes one or more processing units (CPUs) 502, one or more network or other communication interfaces 504, memory 512 and one or more communication buses 514 to interconnect these components. Memory 512 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices or other non-volatile solid-state storage devices. Memory 512 may optionally include one or more storage devices remotely located in relation to the CPU(s) 502. Memory 512 or, alternatively, the non-volatile memory device(s) in memory 512 comprise non-temporary computer-readable storage media. In some embodiments, memory 512 or the computer-readable storage media of memory 512 stores the following programs, modules and data structures, or a subset of these:
• an operating system 516 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 518 that is used to connect the generic server system 112-N to other computers via one or more network communication interfaces 504 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks and the like;
• a search application 520 specific to the particular server system, which can be, for example, a barcode search application, a color recognition search application, a product recognition search application, an object or object category search application, or the like;
• an optional index 522, if the particular search application uses an index;
• an optional image database 524 to store images relevant to the particular search application, where the stored image data, if any, depends on the type of search process;
• an optional result ranking module 526 (sometimes called a relevance scoring module) to rank the results of the search application; the ranking module can assign a relevance score to each result of the search application and, if no result reaches a predefined minimum score, can return a null or zero score to the visual query processing server in the initial interface, indicating that the results from this server system are not relevant; and
• an annotation module 528 to receive annotation information from an annotation database (116, figure 1), determine whether any of the annotation information is relevant to the particular search application and incorporate any relevant parts of the determined annotation information into the respective annotation database 530.
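A minimal sketch of the cutoff behavior attributed to result ranking module 526 follows; the threshold value and the data shapes are assumptions made for illustration.

```python
# Sketch of a per-system relevance cutoff: if no result reaches a preset
# minimum, the system reports a null/zero answer to the front end.
MIN_RELEVANCE = 0.4   # illustrative; the disclosure leaves the value open

def rank_system_results(scored_results):
    """scored_results: list of (result, relevance_score) pairs produced by
    one parallel search application. Returns results best-first, or an
    empty list meaning 'nothing relevant from this system'."""
    ranked = sorted(scored_results, key=lambda pair: pair[1], reverse=True)
    if not ranked or ranked[0][1] < MIN_RELEVANCE:
        return []   # front end treats this as a zero score for the system
    return ranked
```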
Figure 8 is a block diagram illustrating an OCR search system 112-B used to process a visual query according to an embodiment of the present invention. Typically, the OCR search system 112-B includes one or more processing units (CPUs) 602, one or more network or other communication interfaces 604, memory 612 and one or more communication buses 614 to interconnect these components. Memory 612 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices or other non-volatile solid-state storage devices. Memory 612 may optionally include one or more storage devices remotely located in relation to the CPU(s) 602. Memory 612 or, alternatively, the non-volatile memory device(s) in memory 612 comprise non-temporary computer-readable storage media. In some embodiments, memory 612 or the computer-readable storage media of memory 612 stores the following programs, modules and data structures, or a subset of these:
• an operating system 616 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 618 that is used to connect the OCR search system 112-B to other computers via one or more network communication interfaces 604 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks and the like;
• an Optical Character Recognition (OCR) module 620 that attempts to recognize text in the visual query and converts the letter images into characters;
• an optional OCR database 114-B that is used by the OCR module 620 to recognize particular fonts, text patterns and other features unique to letter recognition;
• an optional spelling check module 622 that improves the conversion of letter images into characters by checking the converted words against a dictionary and replacing potentially badly converted letters in words that otherwise correspond to a dictionary word;
• an optional named entity recognition module 624 that searches for named entities in the converted text, sends the recognized named entities as terms in a term query to the term query server system (118, figure 1), and provides the results from the term query server system as links embedded in the OCR-recognized text, associated with the recognized named entities;
• an optional text-matching application 632 that improves the conversion of letter images into characters by checking converted segments (such as converted sentences and paragraphs) against a database of text segments and replacing potentially badly converted letters in OCR-recognized text segments that otherwise correspond to a text segment in the text-matching application; in some embodiments, the text segment found by the text-matching application is provided as a link to the user (for example, if the user has scanned a page of the New York Times, the text-matching application may provide a link to the entire article on the New York Times website);
• a results ranking and formatting module 626 to format the OCR-recognized results for presentation, to format optional links to named entities and, optionally, to rank any related results from the text-matching application; and
• an optional annotation module 628 to receive annotation information from an annotation database (116, figure 1), determine whether any of the annotation information is relevant to the OCR search system and incorporate any relevant parts of the determined annotation information into the respective annotation database 630.
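The dictionary check performed by the optional spelling module 622 can be sketched as follows. This toy version replaces a token with any dictionary word within one edit; a real OCR corrector would weight visually likely confusions (such as "rn" read as "m"), which is omitted here.

```python
# Minimal sketch of OCR spelling correction against a dictionary; the
# one-edit rule is an illustrative simplification, not the patent's method.
def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one substitution, insertion
    or deletion."""
    if abs(len(a) - len(b)) > 1:
        return False
    i = j = edits = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1; j += 1
            continue
        edits += 1
        if edits > 1:
            return False
        if len(a) > len(b):
            i += 1           # treat as a deletion in a
        elif len(a) < len(b):
            j += 1           # treat as an insertion into a
        else:
            i += 1; j += 1   # substitution
    return edits + (len(a) - i) + (len(b) - j) <= 1

def correct_token(token: str, dictionary: set) -> str:
    """Return the token itself if known, else the first dictionary word
    within one edit, else the raw OCR output unchanged."""
    if token in dictionary:
        return token
    for word in dictionary:
        if within_one_edit(token, word):
            return word
    return token
```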
Figure 9 is a block diagram illustrating a facial recognition search system 112-A used to process a visual query according to an embodiment of the present invention. Typically, the facial recognition search system 112-A includes one or more processing units (CPUs) 902, one or more network or other communication interfaces 904, memory 912 and one or more communication buses 914 to interconnect these components. Memory 912 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices or other non-volatile solid-state storage devices. Memory 912 may optionally include one or more storage devices remotely located in relation to the CPU(s) 902. Memory 912 or, alternatively, the non-volatile memory device(s) in memory 912 comprise non-temporary computer-readable storage media. In some embodiments, memory 912 or the computer-readable storage media of memory 912 stores the following programs, modules and data structures, or a subset of these:
• an operating system 916 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 918 that is used to connect the facial recognition search system 112-A to other computers via one or more network communication interfaces 904 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks and the like;
• a facial recognition search application 920 to search a facial image database 114-A for facial images that match the face(s) presented in the visual query, and to search the social network database 922 for information regarding each match found in the facial image database 114-A;
• a facial image database 114-A for storing one or more facial images for a plurality of users; optionally, the facial image database includes facial images for people other than users, such as family members and others known to users, who have been identified as present in images included in the facial image database 114-A; optionally, the facial image database includes facial images obtained from external sources, such as vendors of facial images that are legitimately in the public domain;
• optionally, a social network database 922 that contains information regarding social network users, such as name, address, occupation, group memberships, social network connections, current GPS location of the mobile device, sharing preferences, interests, age, hometown, personal statistics, work information, etc., as discussed in more detail in relation to figure 12A;
• a results ranking and formatting module 924 to rank (for example, assign a relevance and/or match quality score to) the potential facial matches from the facial image database 114-A and format the results for presentation; in some embodiments, the ranking or scoring of the results uses related information retrieved from the aforementioned social network database; in some embodiments, the formatted search results include the potential image matches as well as a subset of information from the social network database; and
• an annotation module 926 to receive annotation information from an annotation database (116, figure 1), determine whether any of the annotation information is relevant to the facial recognition search system and store any relevant parts of the determined annotation information in the respective annotation database 928.
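The use of social network information in scoring, attributed to module 924, could look like the following sketch. The linear blend, the weight and the field names are assumptions; the disclosure says only that related social network information may be used in ranking.

```python
# Hedged sketch: combine raw face-match quality with a social-connection
# signal so that people connected to the querying user rank higher.
def face_match_score(match_quality: float, connection_strength: float,
                     social_weight: float = 0.3) -> float:
    """Both inputs are assumed normalized to [0, 1]."""
    return ((1.0 - social_weight) * match_quality
            + social_weight * connection_strength)

def rank_face_matches(candidates):
    """candidates: dicts with 'person', 'quality' and 'connection' keys."""
    return sorted(
        candidates,
        key=lambda c: face_match_score(c["quality"], c["connection"]),
        reverse=True,
    )
```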
Figure 10 is a block diagram illustrating an image-to-terms search system 112-C used to process a visual query according to an embodiment of the present invention. In some embodiments, the image-to-terms search system recognizes objects (instance recognition) in the visual query. In other embodiments, the image-to-terms search system recognizes object categories (type recognition) in the visual query. In some embodiments, the image-to-terms system recognizes both objects and object categories. The image-to-terms search system returns potential term matches for the images in the visual query. Typically, the image-to-terms search system 112-C includes one or more processing units (CPUs) 1002, one or more network or other communication interfaces 1004, memory 1012 and one or more communication buses 1014 to interconnect these components. Memory 1012 includes high-speed random access memory, such as DRAM, SRAM, DDR RAM or other solid-state random access memory devices; and may include non-volatile memory, such as one or more magnetic disk storage devices, optical disk storage devices, flash memory devices or other non-volatile solid-state storage devices. Memory 1012 may optionally include one or more storage devices remotely located in relation to the CPU(s) 1002. Memory 1012 or, alternatively, the non-volatile memory device(s) in memory 1012 comprise non-temporary computer-readable storage media. In some embodiments, memory 1012 or the computer-readable storage media of memory 1012 stores the following programs, modules and data structures, or a subset of these:
• an operating system 1016 that includes procedures for handling various basic system services and for performing hardware-dependent tasks;
• a network communication module 1018 that is used to connect the image-to-terms search system 112-C to other computers via one or more network communication interfaces 1004 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks and the like;
• an image-to-terms search application 1020 that searches the image search database 114-C for images that match the subject or subjects in the visual query;
• an image search database 114-C that can be searched by the search application 1020 to find images similar to the subject(s) of the visual query;
• an inverse index of terms per image 1022 that stores the textual terms used by users when searching for images using a text-based query search engine 1006;
• a results ranking and formatting module 1024 to rank the potential image matches and/or rank the terms associated with the potential image matches identified in the inverse index of terms per image 1022; and
• an annotation module 1026 to receive annotation information from an annotation database (116, figure 1), determine whether any of the annotation information is relevant to the image-to-terms search system 112-C and store any relevant parts of the determined annotation information in the respective annotation database 1028.
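One plausible reading of how the inverse index of terms per image 1022 feeds the ranking module 1024 is sketched below: visually similar images vote for the terms users previously paired with them. The data shapes and names are assumptions, not the disclosed implementation.

```python
# Sketch of term voting over an inverse index of terms per image.
from collections import Counter

def terms_for_visual_query(similar_image_ids, terms_by_image, top_k=5):
    """terms_by_image: dict mapping an image id to the textual terms users
    employed when that image was found via text search (cf. index 1022).
    Returns the best-supported candidate terms for the visual query."""
    votes = Counter()
    for image_id in similar_image_ids:
        votes.update(terms_by_image.get(image_id, []))
    return [term for term, _ in votes.most_common(top_k)]
```

For example, if most of the near matches for a photograph had previously been reached through the term "hibiscus", that term would become the top candidate returned for the visual query.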
Figures 5-10 are intended to be understood more as functional descriptions of the various features that may be present in a set of computer systems than as a structural schematic of the embodiments described here. In practice, and as recognized by those skilled in the art, items shown separately could be combined and some items could be separated. For example, some items shown separately in these figures could be implemented on individual servers, and individual items could be implemented by one or more servers. The actual number of systems used to implement visual query processing, and how features are allocated among them, will vary from one implementation to another. Each of the methods described here can be governed by instructions that are stored on non-temporary computer-readable storage media and executed by one or more processors of one or more servers or clients. The modules or programs identified above (that is, sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules can be combined or otherwise rearranged in various embodiments. Each of the operations shown in figures 5-10 can correspond to instructions stored in a non-temporary computer memory or computer-readable storage medium. Figure 11 illustrates a client system 102 with a screenshot of an exemplary visual query 1102. The client system 102 shown in figure 11 is a mobile device, such as a cell phone, portable music player or portable e-mail device. The client system 102 includes a screen 706 and one or more input devices 708, such as the buttons shown in this figure. In some embodiments, the screen 706 is a touch screen 709. In embodiments with a touch screen 709, software buttons displayed on the screen 709 can optionally replace some or all of the electromechanical buttons 708. Touch screens are also useful in interacting with the visual query results, as explained in more detail below. Client system 102 also includes an image capture mechanism, such as a camera 710. Figure 11 illustrates a visual query 1102 that is a photograph or video frame of a package on a store shelf. In the embodiments described here, the visual query is a two-dimensional image having a resolution corresponding to the size of the visual query in pixels in each of the two dimensions. In this example, visual query 1102 is a two-dimensional image of three-dimensional objects. Visual query 1102 includes background elements, a product package 1104 and a variety of types of entities on the package, including an image of a person 1106, an image of a trademark 1108, an image of a product 1110 and a variety of textual elements 1112. As explained in relation to figure 3, visual query 1102 is sent to the initial interface server 110, which sends visual query 1102 to a plurality of parallel search systems (112A-N), receives the results and creates an interactive results document. Figures 12A and 12B each illustrate a client system 102 with a screenshot of an embodiment of an interactive results document 1200. The interactive results document 1200 includes one or more visual identifiers 1202 of respective subparts of visual query 1102, each of which includes a user-selectable link to a subset of the search results. Figures 12A and 12B illustrate an interactive results document 1200 with visual identifiers that are bounding boxes 1202 (for example, bounding boxes 1202-1, 1202-2, 1202-3). In the embodiments shown in figures 12A and 12B, the user activates the display of the search results corresponding to a particular subpart by tapping the activation region inside the space outlined by its bounding box 1202.
For example, the user would activate the search results corresponding to the image of the person by tapping a bounding box 1306 (figure 13) surrounding the image of the person. In other embodiments, the selectable link is selected using a mouse or keyboard rather than a touch-sensitive screen. In some embodiments, the first corresponding search result is displayed when a user previews a bounding box 1202 (that is, when the user single-clicks, single-taps or hovers over the bounding box). The user activates the display of a plurality of corresponding search results when the user selects the bounding box (that is, when the user double-clicks, double-taps or uses another mechanism to indicate selection). In figures 12A and 12B, the visual identifiers are bounding boxes 1202 surrounding subparts of the visual query. Figure 12A illustrates bounding boxes 1202 that are square or rectangular. Figure 12B illustrates a bounding box 1202 that outlines the contour of an identifiable entity in the subpart of the visual query, such as bounding box 1202-3 for a beverage bottle. In some embodiments, a respective bounding box 1202 includes smaller bounding boxes 1202 within it. For example, in figures 12A and 12B, the bounding box identifying the package 1202-1 surrounds the bounding box identifying the trademark 1202-2 and all the other bounding boxes 1202. Some embodiments that include text also include active quick links 1204 for some of the textual terms. Figure 12B shows an example where "Active Drink" and "United States" are displayed as quick links 1204. The search results corresponding to these terms are results received from the term query server system 118, while the results corresponding to the bounding boxes are results from the query-by-image search systems. Figure 13 illustrates a client system 102 with a screenshot of an interactive results document 1200 that is coded by the type of entity recognized in the visual query. The visual query of figure 11 contains an image of a person 1106, an image of a trademark 1108, an image of a product 1110 and a variety of textual elements 1112. As such, the interactive results document 1200 shown in figure 13 includes bounding boxes 1202 around a person 1306, a trademark 1308, a product 1310 and the two text areas 1312. Each of the bounding boxes in figure 13 is presented with separate hatching, representing differently colored transparent bounding boxes 1202. In some embodiments, the visual identifiers of the bounding boxes (and/or the labels or other visual identifiers in the interactive results document 1200) are formatted for presentation in visually distinctive ways, such as overlay color, overlay pattern, label background color, label background pattern, label font color and bounding box border color. The type coding for particular recognized entities is shown in relation to bounding boxes in figure 13, but type coding can also be applied to visual identifiers that are labels. Figure 14 illustrates a client device 102 with a screenshot of an interactive results document 1200 with labels 1402 as the visual identifiers of respective subparts of the visual query 1102 of figure 11. Each of the label visual identifiers 1402 includes a user-selectable link to a subset of the corresponding search results. In some embodiments, the selectable link is identified by descriptive text displayed in the area of the label 1402. Some embodiments include a plurality of links within one label 1402.
For example, in figure 14, the label placed over the image of a woman drinking includes a link to facial recognition results for the woman and a link to image recognition results for that particular picture (for example, images of other products or advertisements using the same picture). In figure 14, the labels 1402 are displayed as partially transparent areas with text, located over their respective subparts of the interactive results document. In other embodiments, a respective label is positioned near, but not over, its respective subpart of the interactive results document. In some embodiments, the labels are coded by type in the same manner discussed in relation to figure 13. In some embodiments, the user activates the display of the search results corresponding to a particular subpart corresponding to a label 1302 by tapping the activation region inside the space outlined by the edges or periphery of the label 1302. The same preview and selection functions discussed above in relation to the bounding boxes of figures 12A and 12B also apply to visual identifiers that are labels 1402. Figure 15 illustrates a screenshot of an interactive results document 1200 and the original visual query 1102 displayed concurrently with a results list 1500. In some embodiments, the interactive results document 1200 is displayed by itself, as shown in figures 12-14. In other embodiments, the interactive results document 1200 is displayed concurrently with the original visual query, as shown in figure 15. In some embodiments, the list of visual query results 1500 is displayed concurrently along with the original visual query 1102 and/or the interactive results document 1200. The type of client system and the amount of space on screen 706 may determine whether the results list 1500 is displayed concurrently with the interactive results document 1200. In some embodiments, client system 102 receives (in response to a visual query submitted to the visual query server system) both the results list 1500 and the interactive results document 1200, but only displays the results list 1500 when the user scrolls below the interactive results document 1200. In some of these embodiments, client system 102 displays the results corresponding to a user-selected visual identifier 1202/1402 without having to query the server again, because the results list 1500 is received by client system 102 in response to the visual query and then stored locally on client system 102. In some embodiments, the results list 1500 is organized into categories 1502. Each category contains at least one result 1503. In some embodiments, the category titles are highlighted to distinguish them from the results 1503. The categories 1502 are ordered according to their calculated category weight. In some embodiments, the category weight is a combination of the weights of the N highest-scoring results in that category. As such, the category likely to produce the most relevant results is displayed first. In embodiments where more than one category 1502 is returned for the same recognized entity (such as the facial image recognition match and the image match shown in figure 15), the category displayed first has the higher category weight.
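Since the text says only that a category's weight is "a combination" of its top N result scores, the sketch below picks one concrete combination (a plain sum) purely for illustration; N and the combining rule are assumptions.

```python
# Illustrative category ordering: weight = sum of the N best result scores.
def category_weight(result_scores, n: int = 3) -> float:
    return sum(sorted(result_scores, reverse=True)[:n])

def order_categories(score_lists: dict, n: int = 3):
    """score_lists maps category name -> list of result relevance scores;
    returns category names in display order, heaviest first."""
    return sorted(score_lists,
                  key=lambda name: category_weight(score_lists[name], n),
                  reverse=True)

# e.g. order_categories({"facial recognition match": [0.9, 0.7],
#                        "image match": [0.95, 0.2, 0.1]})
# -> ["facial recognition match", "image match"]
```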
As explained in relation to figure 3, in some embodiments, when a selectable link in the interactive results document 1200 is selected by a user of client system 102, the cursor automatically moves to the appropriate category 1502, or to the first result 1503 in that category. Alternatively, when a selectable link in the interactive results document is selected by a user of client system 102, the results list 1500 is reordered so that the category or categories relevant to the selected link are displayed first. This is accomplished, for example, either by coding the selectable links with information identifying the corresponding search results, or by coding the search results to indicate the corresponding selectable links or the corresponding result categories. In some embodiments, the categories of the search results correspond to the query-by-image search system that produced those search results. For example, in figure 15, some of the categories are product match 1506, logo match 1508, facial recognition match 1510 and image match 1512. The original visual query 1102 and/or an interactive results document 1200 can be similarly displayed with a category title, such as query 1504. Similarly, results from any term search performed by the term query server can also be displayed as a separate category, such as Internet results 1514. In other embodiments, more than one entity in a visual query will produce results from the same query-by-image search system. For example, the visual query could include two different faces that would return separate results from the facial recognition search system. As such, in some embodiments, the categories 1502 are divided by recognized entity rather than by search system. In some embodiments, an image of the recognized entity is displayed in the category header of the recognized entity 1502, so that the results for that recognized entity are distinguishable from the results for another recognized entity, even though both sets of results are produced by the same query-by-image search system. For example, in figure 15, the product match category 1506 includes two product entities and, as such, two entity categories 1502, a boxed product 1516 and a bottled product 1518, each having a plurality of corresponding search results 1503. In some embodiments, the categories can be divided by recognized entity and by type of query-by-image system. For example, in figure 15, there are two separate entities that returned relevant results under the product match category. In some embodiments, the results 1503 include thumbnail images. For example, as shown for the facial recognition match results in figure 15, smaller versions (also called thumbnail images) of the pictures of the facial matches for Actress X and Social Network Friend Y are displayed along with some textual description, such as the name of the person in the image. The foregoing description, for purposes of explanation, has been given with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings.
The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications, thereby enabling others skilled in the art to best utilize the invention and the various embodiments with the various modifications that are suited to the particular use contemplated.
Claims (3)
1. A computer-implemented method of processing a visual query, characterized by the fact that it comprises: on a server system having one or more processors and memory storing instructions for execution by the one or more processors: obtaining, from a client system, a visual query having two or more objects, including (i) a first object having a first object type and (ii) a second object having a second object type distinct from the first object type, the first and second object types being selected from the group consisting of: OCR characters, a person's face, a non-human object and a barcode; partitioning the visual query into two or more regions, including a first region and a second region, where the first region includes the first object and the second region includes the second object; processing the visual query by simultaneously obtaining visual query search results, including (i) a first set of results obtained in accordance with the first object and (ii) a second set of results obtained in accordance with the second object; formatting, for simultaneous display to a user, (i) the first set of results and (ii) the second set of results; obtaining one or more user annotations of a particular search result in the first set of search results or the second set of search results, where the one or more user annotations indicate an action taken by a user that indicates the relevance, or lack thereof, of the respective search results to the visual query; obtaining a second visual query; and, in response to obtaining the second visual query, obtaining a second plurality of search results based on at least one annotation of the one or more user annotations.
2. The computer-implemented method according to claim 1, characterized by the fact that the visual query is selected from the group consisting of: a photograph, a screenshot, a scanned image, a video frame, and a plurality of video frames.
3. A search engine system for processing a visual query, characterized by the fact that it comprises: one or more central processing units for executing instructions; memory storing instructions to be executed by the one or more central processing units; the instructions performing the following steps: obtaining, from a client system, a visual query having two or more objects, including (i) a first object having a first object type and (ii) a second object having a second object type distinct from the first object type, the first and second object types being selected from the group consisting of: OCR characters, a person's face, a non-human object and a barcode; partitioning the visual query into two or more regions
Llc|System and method for dynamically generating, maintaining, and growing an online social network| JP2006085379A|2004-09-15|2006-03-30|Canon Inc|Information processor and its control method, and program| US8489583B2|2004-10-01|2013-07-16|Ricoh Company, Ltd.|Techniques for retrieving documents using an image capture device| US9176984B2|2006-07-31|2015-11-03|Ricoh Co., Ltd|Mixed media reality retrieval of differentially-weighted links| US7809763B2|2004-10-15|2010-10-05|Oracle International Corporation|Method for updating database object metadata| US20060085386A1|2004-10-19|2006-04-20|Microsoft Corporation|Two pass calculation to optimize formula calculations for a spreadsheet| WO2006043319A1|2004-10-20|2006-04-27|Fujitsu Limited|Terminal and server| US8320641B2|2004-10-28|2012-11-27|DigitalOptics Corporation Europe Limited|Method and apparatus for red-eye detection using preview or other reference images| US20060149700A1|2004-11-11|2006-07-06|Gladish Randolph J|System and method for automatic geospatial web network generation via metadata transformation| US20060150119A1|2004-12-31|2006-07-06|France Telecom|Method for interacting with automated information agents using conversational queries| WO2006070047A1|2004-12-31|2006-07-06|Nokia Corporation|Provision of target specific information| JP4282612B2|2005-01-19|2009-06-24|エルピーダメモリ株式会社|Memory device and refresh method thereof| US20060173824A1|2005-02-01|2006-08-03|Metalincs Corporation|Electronic communication analysis and visualization| JPWO2006082979A1|2005-02-07|2008-06-26|松下電器産業株式会社|Image processing apparatus and image processing method| JP4267584B2|2005-02-28|2009-05-27|株式会社東芝|Device control apparatus and method| US7917299B2|2005-03-03|2011-03-29|Washington University|Method and apparatus for performing similarity searching on a data stream with respect to a query string| US7587387B2|2005-03-31|2009-09-08|Google Inc.|User interface for facts query engine with snippets from information sources that include query terms and answer terms| US7765231B2|2005-04-08|2010-07-27|Rathus Spencer A|System and method for accessing electronic data via an image search engine| US7773822B2|2005-05-02|2010-08-10|Colormax, Inc.|Apparatus and methods for management of electronic images| US7760917B2|2005-05-09|2010-07-20|Like.Com|Computer-implemented method for performing similarity searches| US7945099B2|2005-05-09|2011-05-17|Like.Com|System and method for use of images with recognition analysis| US7809192B2|2005-05-09|2010-10-05|Like.Com|System and method for recognizing objects from images and identifying relevancy amongst images and information| US7783135B2|2005-05-09|2010-08-24|Like.Com|System and method for providing objectified image renderings using recognition information from images| US7809722B2|2005-05-09|2010-10-05|Like.Com|System and method for enabling search and retrieval from image files based on recognized information| US7519200B2|2005-05-09|2009-04-14|Like.Com|System and method for enabling the use of captured images through recognition| KR100754656B1|2005-06-20|2007-09-03|삼성전자주식회사|Method and system for providing user with image related information and mobile communication system| US20080005064A1|2005-06-28|2008-01-03|Yahoo! 
Inc.|Apparatus and method for content annotation and conditional annotation retrieval in a search context| US7702681B2|2005-06-29|2010-04-20|Microsoft Corporation|Query-by-image search and retrieval system| JP2007018166A|2005-07-06|2007-01-25|Nec Corp|Information search device, information search system, information search method, and information search program| JP2007018456A|2005-07-11|2007-01-25|Nikon Corp|Information display device and information display method| US20070022085A1|2005-07-22|2007-01-25|Parashuram Kulkarni|Techniques for unsupervised web content discovery and automated query generation for crawling the hidden web| US8666928B2|2005-08-01|2014-03-04|Evi Technologies Limited|Knowledge repository| US7457825B2|2005-09-21|2008-11-25|Microsoft Corporation|Generating search requests from multimodal queries| US20090060289A1|2005-09-28|2009-03-05|Alex Shah|Digital Image Search System And Method| US7876978B2|2005-10-13|2011-01-25|Penthera Technologies, Inc.|Regions of interest in video frames| US20070098303A1|2005-10-31|2007-05-03|Eastman Kodak Company|Determining a particular person from a collection| US8849821B2|2005-11-04|2014-09-30|Nokia Corporation|Scalable visual search system simplifying access to network and device functionality| US7826665B2|2005-12-12|2010-11-02|Xerox Corporation|Personal information retrieval using knowledge bases for optical character recognition correction| US7725477B2|2005-12-19|2010-05-25|Microsoft Corporation|Power filter for online listing service| US20070179965A1|2006-01-27|2007-08-02|Hogue Andrew W|Designating data objects for analysis| US7464090B2|2006-01-27|2008-12-09|Google Inc.|Object categorization for information extraction| US7555471B2|2006-01-27|2009-06-30|Google Inc.|Data object visualization| US8874591B2|2006-01-31|2014-10-28|Microsoft Corporation|Using user feedback to improve search results| US9336333B2|2006-02-13|2016-05-10|Linkedin Corporation|Searching and reference checking within social networks| US7668405B2|2006-04-07|2010-02-23|Eastman Kodak Company|Forming connections between image collections| US7917514B2|2006-06-28|2011-03-29|Microsoft Corporation|Visual and multi-dimensional search| US20080031506A1|2006-08-07|2008-02-07|Anuradha Agatheeswaran|Texture analysis for mammography computer aided diagnosis| US7934156B2|2006-09-06|2011-04-26|Apple Inc.|Deletion gestures on a portable multifunction device| JP2008071311A|2006-09-15|2008-03-27|Ricoh Co Ltd|Image retrieval apparatus, image retrieval method, image retrieval program, and information storage medium| KR100865973B1|2007-02-08|2008-10-30|올라웍스|Method for searching certain person and method and system for generating copyright report for the certain person| US9058370B2|2007-02-27|2015-06-16|International Business Machines Corporation|Method, system and program product for defining imports into and exports out from a database system using spread sheets by use of a control language| US8861898B2|2007-03-16|2014-10-14|Sony Corporation|Content image search| CN101286092A|2007-04-11|2008-10-15|谷歌股份有限公司|Input method editor having a secondary language mode| US20080267504A1|2007-04-24|2008-10-30|Nokia Corporation|Method, device and computer program product for integrating code-based and optical character recognition technologies into a mobile visual search| US7917518B2|2007-07-20|2011-03-29|Hewlett-Packard Development Company, L.P.|Compositional balance and color driven content retrieval| US9275118B2|2007-07-25|2016-03-01|Yahoo! 
Inc.|Method and system for collecting and presenting historical communication data| JP5207688B2|2007-08-30|2013-06-12|キヤノン株式会社|Image processing apparatus and integrated document generation method| US8145660B2|2007-10-05|2012-03-27|Fujitsu Limited|Implementing an expanded search and providing expanded search results| KR101435140B1|2007-10-16|2014-09-02|삼성전자 주식회사|Display apparatus and method| US9237213B2|2007-11-20|2016-01-12|Yellowpages.Com Llc|Methods and apparatuses to initiate telephone connections| US20090144056A1|2007-11-29|2009-06-04|Netta Aizenbud-Reshef|Method and computer program product for generating recognition error correction information| KR100969298B1|2007-12-31|2010-07-09|인하대학교 산학협력단|Method For Social Network Analysis Based On Face Recognition In An Image or Image Sequences| US20090237546A1|2008-03-24|2009-09-24|Sony Ericsson Mobile Communications Ab|Mobile Device with Image Recognition Processing Capability| US8190604B2|2008-04-03|2012-05-29|Microsoft Corporation|User intention modeling for interactive image retrieval| US8385589B2|2008-05-15|2013-02-26|Berna Erol|Web-based content detection in images, extraction and recognition| US8406531B2|2008-05-15|2013-03-26|Yahoo! Inc.|Data access based on content of image recorded by a mobile device| US20090299990A1|2008-05-30|2009-12-03|Vidya Setlur|Method, apparatus and computer program product for providing correlations between information from heterogenous sources| JP5109836B2|2008-07-01|2012-12-26|株式会社ニコン|Imaging device| US8520979B2|2008-08-19|2013-08-27|Digimarc Corporation|Methods and systems for content processing| US8452794B2|2009-02-11|2013-05-28|Microsoft Corporation|Visual and textual query suggestion| US9087059B2|2009-08-07|2015-07-21|Google Inc.|User interface for presenting search results for multiple regions of a visual query| US9135277B2|2009-08-07|2015-09-15|Google Inc.|Architecture for responding to a visual query| US8670597B2|2009-08-07|2014-03-11|Google Inc.|Facial recognition with social network aiding| US8370358B2|2009-09-18|2013-02-05|Microsoft Corporation|Tagging content with metadata pre-filtered by context| US20110128288A1|2009-12-02|2011-06-02|David Petrou|Region of Interest Selector for Visual Queries| US8811742B2|2009-12-02|2014-08-19|Google Inc.|Identifying matching canonical documents consistent with visual query structural information| US8977639B2|2009-12-02|2015-03-10|Google Inc.|Actionable search results for visual queries| US9183224B2|2009-12-02|2015-11-10|Google Inc.|Identifying matching canonical documents in response to a visual query| US9405772B2|2009-12-02|2016-08-02|Google Inc.|Actionable search results for street view visual queries| US8805079B2|2009-12-02|2014-08-12|Google Inc.|Identifying matching canonical documents in response to a visual query and in accordance with geographic information| US9176986B2|2009-12-02|2015-11-03|Google Inc.|Generating a combination of a visual query and matching canonical document| US9852156B2|2009-12-03|2017-12-26|Google Inc.|Hybrid use of location sensor data and visual query to return local listings for visual query| US8189964B2|2009-12-07|2012-05-29|Google Inc.|Matching an approximately located query image against a reference image set| US8489589B2|2010-02-05|2013-07-16|Microsoft Corporation|Visual search reranking|US20090327235A1|2008-06-27|2009-12-31|Google Inc.|Presenting references with answers in forums| US8463053B1|2008-08-08|2013-06-11|The Research Foundation Of State University Of New York|Enhanced max margin learning on multimodal data mining in a 
multimedia database| US9135277B2|2009-08-07|2015-09-15|Google Inc.|Architecture for responding to a visual query| EP2341450A1|2009-08-21|2011-07-06|Mikko Kalervo Väänänen|Method and means for data searching and language translation| US8121618B2|2009-10-28|2012-02-21|Digimarc Corporation|Intuitive computing methods and systems| US9176986B2|2009-12-02|2015-11-03|Google Inc.|Generating a combination of a visual query and matching canonical document| US9405772B2|2009-12-02|2016-08-02|Google Inc.|Actionable search results for street view visual queries| US9852156B2|2009-12-03|2017-12-26|Google Inc.|Hybrid use of location sensor data and visual query to return local listings for visual query| CN102782733B|2009-12-31|2015-11-25|数字标记公司|Adopt the method and the allocation plan that are equipped with the smart phone of sensor| US9197736B2|2009-12-31|2015-11-24|Digimarc Corporation|Intuitive computing methods and systems| US8600173B2|2010-01-27|2013-12-03|Dst Technologies, Inc.|Contextualization of machine indeterminable information based on machine determinable information| WO2012050251A1|2010-10-14|2012-04-19|엘지전자 주식회사|Mobile terminal and method for controlling same| US8861896B2|2010-11-29|2014-10-14|Sap Se|Method and system for image-based identification| US8995775B2|2011-05-02|2015-03-31|Facebook, Inc.|Reducing photo-tagging spam| JP5316582B2|2011-05-23|2013-10-16|コニカミノルタ株式会社|Image processing system, image processing device, terminal device, and control program| EP2533141A1|2011-06-07|2012-12-12|Amadeus S.A.S.|A personal information display system and associated method| WO2012176317A1|2011-06-23|2012-12-27|サイバーアイ・エンタテインメント株式会社|Image recognition system-equipped interest graph collection system using relationship search| KR101814120B1|2011-08-26|2018-01-03|에스프린팅솔루션 주식회사|Method and apparatus for inserting image to electrical document| JP6251906B2|2011-09-23|2017-12-27|ディジマーク コーポレイション|Smartphone sensor logic based on context| US8890827B1|2011-10-05|2014-11-18|Google Inc.|Selected content refinement mechanisms| US9032316B1|2011-10-05|2015-05-12|Google Inc.|Value-based presentation of user-selectable computing actions| US10013152B2|2011-10-05|2018-07-03|Google Llc|Content selection disambiguation| US9305108B2|2011-10-05|2016-04-05|Google Inc.|Semantic selection and purpose facilitation| US8878785B1|2011-10-05|2014-11-04|Google Inc.|Intent determination using geometric shape input| US8930393B1|2011-10-05|2015-01-06|Google Inc.|Referent based search suggestions| US8825671B1|2011-10-05|2014-09-02|Google Inc.|Referent determination from selected content| US8589410B2|2011-10-18|2013-11-19|Microsoft Corporation|Visual search using multiple visual input modalities| EP2587745A1|2011-10-26|2013-05-01|Swisscom AG|A method and system of obtaining contact information for a person or an entity| TWI451347B|2011-11-17|2014-09-01|Univ Nat Chiao Tung|Goods data searching system and method thereof| US8891907B2|2011-12-06|2014-11-18|Google Inc.|System and method of identifying visual objects| US9355317B2|2011-12-14|2016-05-31|Nec Corporation|Video processing system, video processing method, video processing device for mobile terminal or server and control method and control program thereof| JP2015062090A|2011-12-15|2015-04-02|日本電気株式会社|Video processing system, video processing method, video processing device for portable terminal or server, and control method and control program of the same| US10115127B2|2011-12-16|2018-10-30|Nec Corporation|Information processing system, information processing method, communications 
terminals and control method and control program thereof| US9165187B2|2012-01-12|2015-10-20|Kofax, Inc.|Systems and methods for mobile image capture and processing| US10146795B2|2012-01-12|2018-12-04|Kofax, Inc.|Systems and methods for mobile image capture and processing| US8620021B2|2012-03-29|2013-12-31|Digimarc Corporation|Image-related methods and arrangements| US10408613B2|2013-07-12|2019-09-10|Magic Leap, Inc.|Method and system for rendering virtual content| US8935246B2|2012-08-08|2015-01-13|Google Inc.|Identifying textual terms in response to a visual query| US8868598B2|2012-08-15|2014-10-21|Microsoft Corporation|Smart user-centric information aggregation| CN102902771A|2012-09-27|2013-01-30|百度国际科技(深圳)有限公司|Method, device and server for searching pictures| CN102930263A|2012-09-27|2013-02-13|百度国际科技(深圳)有限公司|Information processing method and device| US8990194B2|2012-11-02|2015-03-24|Google Inc.|Adjusting content delivery based on user submissions of photographs| US20140149257A1|2012-11-28|2014-05-29|Jim S. Baca|Customized Shopping| US9298712B2|2012-12-13|2016-03-29|Microsoft Technology Licensing, Llc|Content and object metadata based search in e-reader environment| CA2900765A1|2013-02-08|2014-08-14|Emotient|Collection of machine learning training data for expression recognition| US10235358B2|2013-02-21|2019-03-19|Microsoft Technology Licensing, Llc|Exploiting structured content for unsupervised natural language semantic parsing| US9208176B2|2013-03-12|2015-12-08|International Business Machines Corporation|Gesture-based image shape filtering| US9258597B1|2013-03-13|2016-02-09|Google Inc.|System and method for obtaining information relating to video images| US9355312B2|2013-03-13|2016-05-31|Kofax, Inc.|Systems and methods for classifying objects in digital images captured using mobile devices| US9247309B2|2013-03-14|2016-01-26|Google Inc.|Methods, systems, and media for presenting mobile content corresponding to media content| US9705728B2|2013-03-15|2017-07-11|Google Inc.|Methods, systems, and media for media transmission and management| US20140316841A1|2013-04-23|2014-10-23|Kofax, Inc.|Location-based workflows and services| US20140330814A1|2013-05-03|2014-11-06|Tencent TechnologyCompany Limited|Method, client of retrieving information and computer storage medium| AU2014271204B2|2013-05-21|2019-03-14|Fmp GroupPty Limited|Image recognition of vehicle parts| US10176500B1|2013-05-29|2019-01-08|A9.Com, Inc.|Content classification based on data recognition| GB201314642D0|2013-08-15|2013-10-02|Summerfield Gideon|Image Identification System and Method| CN104424257A|2013-08-28|2015-03-18|北大方正集团有限公司|Information indexing unit and information indexing method| CN103455590B|2013-08-29|2017-05-31|百度在线网络技术(北京)有限公司|The method and apparatus retrieved in touch-screen equipment| WO2015035477A1|2013-09-11|2015-03-19|See-Out Pty Ltd|Image searching method and apparatus| US10095833B2|2013-09-22|2018-10-09|Ricoh Co., Ltd.|Mobile information gateway for use by medical personnel| US10127636B2|2013-09-27|2018-11-13|Kofax, Inc.|Content-based detection and three dimensional geometric reconstruction of objects in image and video data| JP2016538783A|2013-11-15|2016-12-08|コファックス, インコーポレイテッド|System and method for generating a composite image of a long document using mobile video data| US9491522B1|2013-12-31|2016-11-08|Google Inc.|Methods, systems, and media for presenting supplemental content relating to media content on a content interface based on state information that indicates a subsequent visit to the content 
interface| US9411825B2|2013-12-31|2016-08-09|Streamoid Technologies Pvt. Ltd.|Computer implemented system for handling text distracters in a visual search| US10002191B2|2013-12-31|2018-06-19|Google Llc|Methods, systems, and media for generating search results based on contextual information| US9456237B2|2013-12-31|2016-09-27|Google Inc.|Methods, systems, and media for presenting supplemental information corresponding to on-demand media content| US10024679B2|2014-01-14|2018-07-17|Toyota Motor Engineering & Manufacturing North America, Inc.|Smart necklace with stereo vision and onboard processing| US10248856B2|2014-01-14|2019-04-02|Toyota Motor Engineering & Manufacturing North America, Inc.|Smart necklace with stereo vision and onboard processing| US10360907B2|2014-01-14|2019-07-23|Toyota Motor Engineering & Manufacturing North America, Inc.|Smart necklace with stereo vision and onboard processing| US9915545B2|2014-01-14|2018-03-13|Toyota Motor Engineering & Manufacturing North America, Inc.|Smart necklace with stereo vision and onboard processing| KR101791518B1|2014-01-23|2017-10-30|삼성전자주식회사|Method and apparatus for verifying user| US9832353B2|2014-01-31|2017-11-28|Digimarc Corporation|Methods for encoding, decoding and interpreting auxiliary data in media signals| KR101826815B1|2014-02-10|2018-02-07|지니 게엠베하|Systems and methods for image-feature-based recognition| US9311639B2|2014-02-11|2016-04-12|Digimarc Corporation|Methods, apparatus and arrangements for device to device communication| US9811592B1|2014-06-24|2017-11-07|Google Inc.|Query modification based on textual resource context| US9830391B1|2014-06-24|2017-11-28|Google Inc.|Query modification based on non-textual resource context| US9798708B1|2014-07-11|2017-10-24|Google Inc.|Annotating relevant content in a screen capture image| US10062099B2|2014-07-25|2018-08-28|Hewlett Packard Enterprise Development Lp|Product identification based on location associated with image of product| US10024667B2|2014-08-01|2018-07-17|Toyota Motor Engineering & Manufacturing North America, Inc.|Wearable earpiece for providing social and environmental awareness| US9965559B2|2014-08-21|2018-05-08|Google Llc|Providing automatic actions for mobile onscreen content| JP6220079B2|2014-09-08|2017-10-25|日本電信電話株式会社|Display control apparatus, display control method, and display control program| US10024678B2|2014-09-17|2018-07-17|Toyota Motor Engineering & Manufacturing North America, Inc.|Wearable clip for providing social and environmental awareness| US9922236B2|2014-09-17|2018-03-20|Toyota Motor Engineering & Manufacturing North America, Inc.|Wearable eyeglasses for providing social and environmental awareness| US9760788B2|2014-10-30|2017-09-12|Kofax, Inc.|Mobile document detection and orientation based on reference object characteristics| CN104391938B|2014-11-24|2017-10-10|武汉海川云谷软件技术有限公司|A kind of picture batch in physical assets management imports the method and system of database| CN104615639B|2014-11-28|2018-08-24|北京百度网讯科技有限公司|A kind of method and apparatus for providing the presentation information of picture| CN104536995B|2014-12-12|2016-05-11|北京奇虎科技有限公司|The method and system of searching for based on terminal interface touch control operation| CN104572986A|2015-01-04|2015-04-29|百度在线网络技术(北京)有限公司|Information searching method and device| US11120478B2|2015-01-12|2021-09-14|Ebay Inc.|Joint-based item recognition| US20160217157A1|2015-01-23|2016-07-28|Ebay Inc.|Recognition of items depicted in images| US10490102B2|2015-02-10|2019-11-26|Toyota Motor Engineering 
& Manufacturing North America, Inc.|System and method for braille assistance| US9586318B2|2015-02-27|2017-03-07|Toyota Motor Engineering & Manufacturing North America, Inc.|Modular robot with smart device| US9811752B2|2015-03-10|2017-11-07|Toyota Motor Engineering & Manufacturing North America, Inc.|Wearable smart device and method for redundant object identification| US9760792B2|2015-03-20|2017-09-12|Netra, Inc.|Object detection and classification| US9922271B2|2015-03-20|2018-03-20|Netra, Inc.|Object detection and classification| US9972216B2|2015-03-20|2018-05-15|Toyota Motor Engineering & Manufacturing North America, Inc.|System and method for storing and playback of information for blind users| CN104794220A|2015-04-28|2015-07-22|百度在线网络技术(北京)有限公司|Information search method and information search device| US9703541B2|2015-04-28|2017-07-11|Google Inc.|Entity action suggestion on a mobile device| KR101690528B1|2015-06-05|2016-12-28|오드컨셉 주식회사|Method, apparatus and computer program for displaying serch information| US10062015B2|2015-06-25|2018-08-28|The Nielsen Company , Llc|Methods and apparatus for identifying objects depicted in a video using extracted video frames in combination with a reverse image search engine| KR20180021669A|2015-06-26|2018-03-05|로비 가이드스, 인크.|System and method for automatic formatting of images for media assets based on user profiles| WO2017000109A1|2015-06-29|2017-01-05|北京旷视科技有限公司|Search method, search apparatus, user equipment, and computer program product| US10769200B1|2015-07-01|2020-09-08|A9.Com, Inc.|Result re-ranking for object recognition| US10242285B2|2015-07-20|2019-03-26|Kofax, Inc.|Iterative recognition-guided thresholding and data extraction| CN105069083B|2015-07-31|2019-03-08|小米科技有限责任公司|The determination method and device of association user| US20180322208A1|2015-08-03|2018-11-08|Orand S.A.|System and method for searching for products in catalogs| US9898039B2|2015-08-03|2018-02-20|Toyota Motor Engineering & Manufacturing North America, Inc.|Modular smart necklace| ITUB20153277A1|2015-08-28|2017-02-28|St Microelectronics Srl|PROCEDURE FOR VISUAL VISA, SYSTEM, EQUIPMENT AND COMPUTER PRODUCT| US10970646B2|2015-10-01|2021-04-06|Google Llc|Action suggestions for user-selected content| US11055343B2|2015-10-05|2021-07-06|Pinterest, Inc.|Dynamic search control invocation and visual search| JP6204957B2|2015-10-15|2017-09-27|ヤフー株式会社|Information processing apparatus, information processing method, and information processing program| US20180004845A1|2015-10-16|2018-01-04|Carlos A. 
Munoz|Web Based Information Search Method| US10178527B2|2015-10-22|2019-01-08|Google Llc|Personalized entity repository| US10055390B2|2015-11-18|2018-08-21|Google Llc|Simulated hyperlinks on a mobile device based on user intent and a centered selection of text| US20170185670A1|2015-12-28|2017-06-29|Google Inc.|Generating labels for images associated with a user| US9881236B2|2015-12-28|2018-01-30|Google Llc|Organizing images associated with a user| US10043102B1|2016-01-20|2018-08-07|Palantir Technologies Inc.|Database systems and user interfaces for dynamic and interactive mobile image analysis and identification| US9779293B2|2016-01-27|2017-10-03|Honeywell International Inc.|Method and tool for post-mortem analysis of tripped field devices in process industry using optical character recognition and intelligent character recognition| US10024680B2|2016-03-11|2018-07-17|Toyota Motor Engineering & Manufacturing North America, Inc.|Step based guidance system| US11003667B1|2016-05-27|2021-05-11|Google Llc|Contextual information for a displayed resource| US9958275B2|2016-05-31|2018-05-01|Toyota Motor Engineering & Manufacturing North America, Inc.|System and method for wearable smart device communications| US10152521B2|2016-06-22|2018-12-11|Google Llc|Resource recommendations for a displayed resource| US10353950B2|2016-06-28|2019-07-16|Google Llc|Visual recognition using user tap locations| US10802671B2|2016-07-11|2020-10-13|Google Llc|Contextual information for a displayed resource that includes an image| US10561519B2|2016-07-20|2020-02-18|Toyota Motor Engineering & Manufacturing North America, Inc.|Wearable computing device having a curved back to reduce pressure on vertebrae| US10489459B1|2016-07-21|2019-11-26|Google Llc|Query recommendations for a displayed resource| US10051108B2|2016-07-21|2018-08-14|Google Llc|Contextual information for a notification| US10467300B1|2016-07-21|2019-11-05|Google Llc|Topical resource recommendations for a displayed resource| CA3034661A1|2016-09-06|2018-03-15|Walmart Apollo, Llc|Product part picture picker| US10949605B2|2016-09-13|2021-03-16|Bank Of America Corporation|Interprogram communication with event handling for online enhancements| US10212113B2|2016-09-19|2019-02-19|Google Llc|Uniform resource identifier and image sharing for contextual information display| US10535005B1|2016-10-26|2020-01-14|Google Llc|Providing contextual actions for mobile onscreen content| US10432851B2|2016-10-28|2019-10-01|Toyota Motor Engineering & Manufacturing North America, Inc.|Wearable computing device for detecting photography| USD827143S1|2016-11-07|2018-08-28|Toyota Motor Engineering & Manufacturing North America, Inc.|Blind aid device| US10012505B2|2016-11-11|2018-07-03|Toyota Motor Engineering & Manufacturing North America, Inc.|Wearable system for providing walking directions| US10521669B2|2016-11-14|2019-12-31|Toyota Motor Engineering & Manufacturing North America, Inc.|System and method for providing guidance or feedback to a user| US11237696B2|2016-12-19|2022-02-01|Google Llc|Smart assist for repeated actions| US20180218237A1|2017-01-30|2018-08-02|International Business Machines Corporation|System, method and computer program product for creating a contact group using image analytics| JP6807268B2|2017-04-18|2021-01-06|日本電信電話株式会社|Image recognition engine linkage device and program| US11232305B2|2017-04-28|2022-01-25|Samsung Electronics Co., Ltd.|Method for outputting content corresponding to object and electronic device therefor| 
JP6353118B1|2017-05-10|2018-07-04|ヤフー株式会社|Display program, information providing apparatus, display apparatus, display method, information providing method, and information providing program| US10679068B2|2017-06-13|2020-06-09|Google Llc|Media contextual information from buffered media data| US10652592B2|2017-07-02|2020-05-12|Comigo Ltd.|Named entity disambiguation for providing TV content enrichment| EP3584717A1|2017-08-01|2019-12-25|Samsung Electronics Co., Ltd.|Electronic device and method for providing search result thereof| EP3602321A1|2017-09-13|2020-02-05|Google LLC|Efficiently augmenting images with related content| US11126653B2|2017-09-22|2021-09-21|Pinterest, Inc.|Mixed type image based search results| KR20190047214A|2017-10-27|2019-05-08|삼성전자주식회사|Electronic device and method for controlling the electronic device thereof| US11062176B2|2017-11-30|2021-07-13|Kofax, Inc.|Object detection and image cropping using a multi-detector approach| KR102068535B1|2018-02-28|2020-01-21|엔에이치엔 주식회사|Method for schedule a service based on chat messages| JP6684846B2|2018-04-23|2020-04-22|株式会社ワコム|Article search system| CN108897841A|2018-06-27|2018-11-27|百度在线网络技术(北京)有限公司|Panorama sketch searching method, device, equipment, server and storage medium| US10699112B1|2018-09-28|2020-06-30|Automation Anywhere, Inc.|Identification of key segments in document images| JP6934855B2|2018-12-20|2021-09-15|ヤフー株式会社|Control program| KR101982990B1|2018-12-27|2019-05-27|건국대학교 산학협력단|Method and apparatus for questioning and answering using chatbot| KR101982991B1|2018-12-28|2019-05-27|건국대학교 산학협력단|Method and apparatus for questioning and answering using a plurality of chatbots| KR102245774B1|2019-11-06|2021-04-27|연세대학교 산학협력단|Visual Question Answering Apparatus Using Fair Classification Network and Method Thereof| KR102368560B1|2020-01-31|2022-02-25|연세대학교 산학협력단|Visual Question Answering Apparatus Using Selective Residual Learning and Method Thereof| KR102104246B1|2020-02-17|2020-04-24|주식회사 비에이템|Image search system using screen segmentaion|
Legal status:
2018-01-09 | B25D | Requested change of applicant name approved | Owner name: GOOGLE LLC (US)
2019-01-15 | B06F | Objections, documents and/or translations required after an examination request [chapter 6.6 patent gazette]
2019-07-23 | B06U | Preliminary requirement: requests with searches performed by other patent offices; procedure suspended [chapter 6.21 patent gazette]
2020-04-07 | B09A | Decision: intention to grant [chapter 9.1 patent gazette]
2020-06-09 | B16A | Patent or certificate of addition of invention granted [chapter 16.1 patent gazette] | Free-format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 2010-08-05, SUBJECT TO THE LEGAL CONDITIONS.
2020-10-06 | B16C | Correction of the notification of grant [chapter 16.3 patent gazette] | Free-format text: REF. RPI 2579 OF 2020-06-09, REGARDING THE SET OF CLAIMS.
Priority:
Application number | Priority date | Filing date | Publication | Title
US23239709P | 2009-08-07 | 2009-08-07 | |
US26611609P | 2009-12-02 | 2009-12-02 | |
US12/850,483 | 2009-08-07 | 2010-08-04 | US9135277B2 | Architecture for responding to a visual query
PCT/US2010/044603 | 2009-08-07 | 2010-08-05 | WO2011017557A1 | Architecture for responding to a visual query